cs.AI updates on arXiv.org, October 13, 12:14
A Comparative Study of Spoken Dialog State Tracking Strategies

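This paper presents a comparative study of context management strategies for end-to-end spoken dialog state tracking with Speech-LLMs. In experiments on the SpokenWOZ corpus, providing the full spoken conversation as input yields the highest performance among models of similar size and significantly surpasses prior methods, while attention-pooling-based compression of the spoken history keeps accuracy competitive with a reduced context size.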

arXiv:2510.09424v1 Announce Type: cross

Abstract: This paper presents a comparative study of context management strategies for end-to-end Spoken Dialog State Tracking using Speech-LLMs. We systematically evaluate traditional multimodal context (combining text history and spoken current turn), full spoken history, and compressed spoken history approaches. Our experiments on the SpokenWOZ corpus demonstrate that providing the full spoken conversation as input yields the highest performance among models of similar size, significantly surpassing prior methods. Furthermore, we show that attention-pooling-based compression of the spoken history offers a strong trade-off, maintaining competitive accuracy with reduced context size. Detailed analysis confirms that improvements stem from more effective context utilization.
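
To make the idea of attention-pooling-based compression more concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract gives no architectural details, so the function name `attention_pool`, the fixed query vectors, and the scaled dot-product scoring are all assumptions chosen for illustration. The sketch only shows how T frame embeddings from a spoken turn could be reduced to K pooled vectors (K < T), which is one way a spoken history could take up less context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, queries):
    """Compress T frame vectors into K pooled vectors (hypothetical helper).

    frames:  list of T vectors, each of dimension d (e.g. speech-encoder outputs)
    queries: list of K vectors of dimension d (assumed to be learned in a real
             system; here they are passed in as plain lists)
    Returns a list of K vectors, each an attention-weighted average of the frames.
    """
    d = len(frames[0])
    pooled = []
    for q in queries:
        # Scaled dot-product score between this query and every frame.
        scores = [sum(qi * fi for qi, fi in zip(q, f)) / math.sqrt(d) for f in frames]
        weights = softmax(scores)
        # Weighted sum of the frames, one output dimension at a time.
        pooled.append([sum(w * f[j] for w, f in zip(weights, frames)) for j in range(d)])
    return pooled


# Toy example: 4 frames of dimension 2 compressed to 2 pooled vectors.
frames = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
queries = [[1.0, 0.0], [0.0, 1.0]]
pooled = attention_pool(frames, queries)
print(len(frames), "->", len(pooled))
```

In this sketch the number of queries K sets the compressed length directly, so the size of the trade-off between context size and retained detail is a single parameter. How the paper chooses that length, and where pooling sits relative to the speech encoder and the LLM, is not stated in the abstract.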

Related tags

Spoken dialog state tracking, context management, Speech-LLMs, performance improvement, compression techniques