Trending
Articles related to "Attention Sinks"
Attention Sinks: A 'Catch, Tag, Release' Mechanism for Embeddings
cs.AI updates on arXiv.org 2025-09-23T06:11:49.000000Z
JADE 6.0 Is Out! 14 Multimodal Large Models Hallucinate Frequently, and Long-Reasoning Models Stumble Across the Board?
Fudan Baize Team (复旦白泽战队) 2025-09-11T20:12:39.000000Z
Unveiling Attention Sinks: The Functional Role of First-Token Focus in Stabilizing Large Language Models
MarkTechPost@AI 2025-04-09T21:19:15.000000Z