cs.AI updates on arXiv.org — September 3
AHAMask: Mitigating Instruction Sensitivity in LALMs

This paper proposes AHAMask, which masks a subset of attention heads in the decoder-only LLM backbone of an LALM to trigger specific acoustic task functionalities without any instruction. Experiments show the method matches or exceeds instruction-based prompting on both single and composite tasks, and further reveal that LALMs contain "functional pathways" among their attention heads.

arXiv:2509.01787v1 Announce Type: cross Abstract: Although current large audio language models (LALMs) extend text large language models (LLMs) with generic acoustic understanding abilities, they usually suffer from instruction sensitivity, where different instructions of the same intention can yield drastically different outcomes. In this work, we propose AHAMask, where we simply mask some of the attention heads in the decoder-only LLM backbone of LALMs, to trigger specific acoustic task functionalities without instructions. These masks are efficiently obtained by training on an LALM, with the number of trainable parameters equal to the attention head count in its LLM backbone. We show by experiments that applying such selective attention head masks achieves comparable or even better performance than using instructions, either on single or composite tasks. Besides achieving reliable acoustic task specification for LALMs, this also reveals that LALMs exhibit certain "functional pathways" in their attention heads.
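The mechanism described in the abstract, gating individual attention heads with a mask whose trainable parameter count equals the number of heads, can be sketched compactly. The following is a minimal, hypothetical PyTorch illustration, not the authors' implementation: module names, shapes, and the straight-through binarization are assumptions, and causal masking is omitted for brevity.

```python
# Sketch of the AHAMask idea: gate each attention head in a decoder layer with a
# learnable (near-)binary mask, so the only trainable parameters are one scalar
# per head while the backbone stays frozen. All details here are illustrative.

import torch
import torch.nn as nn


class HeadMaskedAttention(nn.Module):
    """Multi-head self-attention wrapped with a per-head mask (one parameter per head)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)
        # One logit per head; sigmoid + thresholding yields a binary head mask.
        self.head_logits = nn.Parameter(torch.zeros(n_heads))

    def head_mask(self) -> torch.Tensor:
        soft = torch.sigmoid(self.head_logits)
        hard = (soft > 0.5).float()
        # Straight-through estimator: forward pass uses the hard mask,
        # gradients flow through the soft sigmoid.
        return hard + soft - soft.detach()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, time, d_head); causal masking omitted here.
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for z in (q, k, v))
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        heads = attn @ v                                      # per-head outputs
        heads = heads * self.head_mask().view(1, -1, 1, 1)    # zero out masked heads
        return self.out(heads.transpose(1, 2).reshape(b, t, d))


# Training would freeze the backbone and optimize only `head_logits`, so the
# trainable parameter count equals the attention head count in the LLM backbone.
```

A given mask configuration then plays the role of an instruction: selecting which heads stay active steers the frozen model toward a particular acoustic task.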

Related tags

AHAMask · LALM · instruction sensitivity · acoustic tasks · attention heads