cs.AI updates on arXiv.org 10月01日
CCoT-Emo:音频语言模型情感推理新框架
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出了一种名为CCoT-Emo的框架,通过引入结构化的情感图(EGs)来指导大型音频语言模型(LALMs)进行情感推理,无需微调。实验表明,CCoT-Emo在语音情感识别(SER)任务上优于现有方法,并提高了零样本基线的准确性。

arXiv:2509.25458v1 Announce Type: new Abstract: Large audio-language models (LALMs) exhibit strong zero-shot performance across speech tasks but struggle with speech emotion recognition (SER) due to weak paralinguistic modeling and limited cross-modal reasoning. We propose Compositional Chain-of-Thought Prompting for Emotion Reasoning (CCoT-Emo), a framework that introduces structured Emotion Graphs (EGs) to guide LALMs in emotion inference without fine-tuning. Each EG encodes seven acoustic features (e.g., pitch, speech rate, jitter, shimmer), textual sentiment, keywords, and cross-modal associations. Embedded into prompts, EGs provide interpretable and compositional representations that enhance LALM reasoning. Experiments across SER benchmarks show that CCoT-Emo outperforms prior SOTA and improves accuracy over zero-shot baselines.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

音频语言模型 情感推理 CCoT-Emo 语音情感识别 零样本性能
相关文章