Articles related to "cross-modal alignment"
SEPS: Semantic-enhanced Patch Slimming Framework for fine-grained cross-modal alignment
cs.AI updates on arXiv.org
2025-11-05T05:30:30.000000Z
Lightweight, efficient, and plug-and-play: Video-RAG brings a new paradigm to long-video understanding
机器之心
2025-10-20T14:13:56.000000Z
Topological Alignment of Shared Vision-Language Embedding Space
cs.AI updates on arXiv.org
2025-10-14T04:19:22.000000Z
TTOM: Test-Time Optimization and Memorization for Compositional Video Generation
cs.AI updates on arXiv.org
2025-10-10T04:13:52.000000Z
Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data
cs.AI updates on arXiv.org
2025-10-07T04:11:46.000000Z
Xiaomi open-sources Xiaomi-MiMo-Audio, its first native end-to-end large speech model
oschina.net
2025-09-19T02:47:36.000000Z
Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval
Hugging Face
2025-09-11T19:37:07.000000Z
Beyond Pixels: Introducing Geometric-Semantic World Priors for Video-based Embodied Models via Spatio-temporal Alignment
cs.AI updates on arXiv.org
2025-09-03T04:16:58.000000Z
Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
cs.AI updates on arXiv.org
2025-07-11T04:04:26.000000Z
The first full-stack technical survey of multimodal RAG is out
PaperAgent
2025-02-24T16:22:53.000000Z
Ola: A State-of-the-Art Omni-Modal Understanding Model with Advanced Progressive Modality Alignment Strategy
MarkTechPost@AI
2025-02-18T06:03:17.000000Z
Efficiently evaluating multimodal pre-training alignment quality: USTC proposes the Modality Integration Rate (MIR)
机器之心
2024-11-04T07:25:26.000000Z