Layer-Aware Brain Alignment and Multimodal Language Decoding

cs.AI updates on arXiv.org, November 5, 13:17

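Building on Meta's work aligning EEG signals with speech embeddings, this paper asks which layers of pre-trained models best reflect the brain's hierarchical processing. It compares embeddings from two models, wav2vec2 and CLIP, uses EEG recordings to evaluate how those embeddings relate to brain activity, and suggests that combining multimodal, layer-aware representations may help decode how the brain understands language.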

arXiv:2511.00065v1 Announce Type: cross Abstract: When we hear the word "house", we don't just process sound, we imagine walls, doors, memories. The brain builds meaning through layers, moving from raw acoustics to rich, multimodal associations. Inspired by this, we build on recent work from Meta that aligned EEG signals with averaged wav2vec2 speech embeddings, and ask a deeper question: which layers of pre-trained models best reflect this layered processing in the brain? We compare embeddings from two models: wav2vec2, which encodes sound into language, and CLIP, which maps words to images. Using EEG recorded during natural speech perception, we evaluate how these embeddings align with brain activity using ridge regression and contrastive decoding. We test three strategies: individual layers, progressive concatenation, and progressive summation. The findings suggest that combining multimodal, layer-aware representations may bring us closer to decoding how the brain understands language, not just as sound, but as experience.
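The three layer strategies named in the abstract (individual layers, progressive concatenation, progressive summation) can be sketched as follows. This is a minimal illustration and not the authors' code: the data are synthetic, the helper names (`layer_features`, `ridge_fit`, `top1_retrieval_accuracy`) are hypothetical, the regression direction (EEG to embedding) is an assumption, and the cosine-similarity retrieval score is only a simple stand-in for the contrastive decoding the abstract mentions.

```python
import numpy as np


def layer_features(layers, strategy, k):
    """Build a feature matrix from layer k, or from layers 0..k.

    layers: array of shape (n_layers, n_samples, dim)
    strategy: "individual", "concat", or "sum"
    """
    if strategy == "individual":
        return layers[k]                                      # layer k only
    if strategy == "concat":
        return np.concatenate(list(layers[: k + 1]), axis=1)  # layers 0..k side by side
    if strategy == "sum":
        return layers[: k + 1].sum(axis=0)                    # element-wise sum of layers 0..k
    raise ValueError(f"unknown strategy: {strategy}")


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)


def top1_retrieval_accuracy(pred, target):
    """Fraction of predictions whose most cosine-similar candidate is their own target."""
    p = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    sims = p @ t.T
    return float(np.mean(sims.argmax(axis=1) == np.arange(len(pred))))


# Synthetic stand-ins (assumptions): 4 model layers and 16-channel "EEG"
# generated as a noisy linear mix of layer 2.
rng = np.random.default_rng(0)
n_layers, n_samples, dim, n_channels = 4, 200, 8, 16
layers = rng.standard_normal((n_layers, n_samples, dim))
mixing = rng.standard_normal((dim, n_channels))
eeg = layers[2] @ mixing + 0.5 * rng.standard_normal((n_samples, n_channels))

train, test = slice(0, 150), slice(150, None)
for strategy in ("individual", "concat", "sum"):
    for k in range(n_layers):
        Y = layer_features(layers, strategy, k)
        W = ridge_fit(eeg[train], Y[train], alpha=1.0)        # map EEG -> embedding
        acc = top1_retrieval_accuracy(eeg[test] @ W, Y[test])
        print(f"{strategy:10s} k={k}: top-1 retrieval = {acc:.2f}")
```

In this toy setup the strategies differ only in how the target matrix is built, so the same ridge fit and retrieval score can be reused across all of them. Real wav2vec2 or CLIP layer activations and real EEG epochs would replace the random arrays.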

Related tags

EEG signals, language decoding, pre-trained models, multimodal representations, hierarchical brain processing
