Trending
Articles related to "image-text understanding"
VLSU: Mapping the Limits of Joint Multimodal Understanding for AI Safety
cs.AI updates on arXiv.org 2025-10-22T04:20:43.000000Z
MULTI: Multimodal Understanding Leaderboard with Text and Images
cs.AI updates on arXiv.org 2025-10-16T04:31:53.000000Z