QTSplus
Query-Aware Tokenizer for Long-Video Multimodal Language Models.
Multimodal LLM
Hey there! Welcome to our team’s corner. We’re enthusiastic about Multimodal Large Language Models, and we look for ways to improve the interaction between language and images, video, and audio.
Our research focuses on helping AI systems better understand and generate multimodal content. We’re always on the lookout for practical techniques that improve capability and efficiency without sacrificing quality.