cs.AI updates on arXiv.org, September 17
PixelHumor: Evaluating LMMs' Ability to Understand Humor

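This article introduces PixelHumor, a dataset of 2,800 annotated multi-panel comics designed to evaluate how well large multimodal models (LMMs) understand and recognize multimodal humor and narrative sequences. Experiments show that today's top models reach only 61% accuracy on the panel sequencing task, far below human performance, which exposes the limits of current models in integrating visual and textual cues.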

arXiv:2509.12248v1 Announce Type: cross Abstract: Understanding humor is a core aspect of social intelligence, yet it remains a significant challenge for Large Multimodal Models (LMMs). We introduce PixelHumor, a benchmark dataset of 2,800 annotated multi-panel comics designed to evaluate LMMs' ability to interpret multimodal humor and recognize narrative sequences. Experiments with state-of-the-art LMMs reveal substantial gaps: for instance, top models achieve only 61% accuracy in panel sequencing, far below human performance. This underscores critical limitations in current models' integration of visual and textual cues for coherent narrative and humor understanding. By providing a rigorous framework for evaluating multimodal contextual and narrative reasoning, PixelHumor aims to drive the development of LMMs that better engage in natural, socially aware interactions.

Related tags

Multimodal models, humor understanding, datasets, narrative analysis, LMMs