Articles tagged "Cross-Modal Understanding" (跨模态理解)
HMVLM: Human Motion-Vision-Language Model via MoE LoRA
cs.AI updates on arXiv.org
2025-11-05T05:30:35.000000Z
Surpassing Google and Meta: How Did 360's FG-CLIP2 Become the "World's Strongest Image-Text Model"?
AI大模型工场
2025-11-04T16:29:32.000000Z
Visual Features Across Modalities: SVG and ASCII Art Reveal Cross-Modal Understanding
https://simonwillison.net/atom/everything
2025-10-25T03:31:07.000000Z
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
cs.AI updates on arXiv.org
2025-10-20T04:14:47.000000Z
Being-VL's Visual BPE Approach: Truly Unifying "Seeing" and "Speaking"
机器之心
2025-10-09T04:21:35.000000Z
From Discrete Tokens to Multimodal Unification: A Panoramic Survey of Discrete Tokenization Released
PaperWeekly
2025-08-11T08:59:59.000000Z
Integrating 500+ Real-World Multimodal Tasks: The New MEGA-Bench Evaluation Suite Finds CoT Can Actually Hurt Open-Source Models
新智元
2024-11-16T14:16:08.000000Z
Integrating 500+ Real-World Multimodal Tasks: The New MEGA-Bench Evaluation Suite Finds CoT Can Actually Hurt Open-Source Models
36kr-科技
2024-11-15T07:36:44.000000Z