Trending
Articles related to "modality gap"
Rethinking the Text-Vision Reasoning Imbalance in MLLMs through the Lens of Training Recipes
cs.AI updates on arXiv.org 2025-10-28T04:04:02.000000Z
Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models
cs.AI updates on arXiv.org 2025-10-15T04:56:55.000000Z
Uncovering Grounding IDs: How External Cues Shape Multi-Modal Binding
cs.AI updates on arXiv.org 2025-09-30T04:06:05.000000Z
Multimodal RAG Enhanced Visual Description
cs.AI updates on arXiv.org 2025-08-14T04:18:49.000000Z
KinMo: Kinematic-aware Human Motion Understanding and Generation
cs.AI updates on arXiv.org 2025-08-05T11:10:13.000000Z
GIIFT: Graph-guided Inductive Image-free Multimodal Machine Translation
cs.AI updates on arXiv.org 2025-07-25T04:28:43.000000Z