Cross-Modal Alignment_Fishai

Articles related to "Cross-Modal Alignment"

SEPS: Semantic-enhanced Patch Slimming Framework for fine-grained cross-modal alignment

cs.AI updates on arXiv.org 2025-11-05T05:30:30.000000Z

Lightweight, Efficient, Plug-and-Play: Video-RAG Brings a New Paradigm to Long-Video Understanding

机器之心 2025-10-20T14:13:56.000000Z

Topological Alignment of Shared Vision-Language Embedding Space

cs.AI updates on arXiv.org 2025-10-14T04:19:22.000000Z

TTOM: Test-Time Optimization and Memorization for Compositional Video Generation

cs.AI updates on arXiv.org 2025-10-10T04:13:52.000000Z

Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data

cs.AI updates on arXiv.org 2025-10-07T04:11:46.000000Z

Xiaomi Open-Sources Its First Native End-to-End Large Speech Model, Xiaomi-MiMo-Audio

oschina.net 2025-09-19T02:47:36.000000Z

Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval

Hugging Face 2025-09-11T19:37:07.000000Z

Beyond Pixels: Introducing Geometric-Semantic World Priors for Video-based Embodied Models via Spatio-temporal Alignment

cs.AI updates on arXiv.org 2025-09-03T04:16:58.000000Z

Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge

cs.AI updates on arXiv.org 2025-07-11T04:04:26.000000Z

The First Full-Stack Technical Survey of Multimodal RAG Is Out

PaperAgent 2025-02-24T16:22:53.000000Z

Ola: A State-of-the-Art Omni-Modal Understanding Model with Advanced Progressive Modality Alignment Strategy

MarkTechPost@AI 2025-02-18T06:03:17.000000Z

Efficiently Evaluating Multimodal Pre-training Alignment Quality: USTC Proposes the Modality Integration Rate (MIR)

机器之心 2024-11-04T07:25:26.000000Z

Copyright © 2019 FISHAI. All Rights Reserved