Hot Topics
Articles related to "RoPE"
Mamba-3 Surfaces Among ICLR 2026 Submissions: A Triple Upgrade Going All In on the "Inference-First" Paradigm
PaperWeekly 2025-10-12T15:43:41.000000Z
Wavelet-Induced Rotary Encodings: RoPE Meets Graphs
cs.AI updates on arXiv.org 2025-09-29T04:16:02.000000Z
No More KV-Cache Blowups! Andrew Yao's Team at Tsinghua Rewrites the Attention Dimension for Leaner, Stronger Long Contexts | NeurIPS 2025 Spotlight
PaperWeekly 2025-09-25T15:44:28.000000Z
Q-ROAR: Outlier-Aware Rescaling for RoPE Position Interpolation in Quantized Long-Context LLMs
cs.AI updates on arXiv.org 2025-09-19T04:34:54.000000Z
Length-Aware Rotary Position Embedding for Text-Speech Alignment
cs.AI updates on arXiv.org 2025-09-16T05:25:03.000000Z
Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
cs.AI updates on arXiv.org 2025-09-16T05:07:20.000000Z
Xiaomi LLM Team's Paper Selected for ACL 2025 SAC Highlights
小米技术 2025-09-12T08:06:06.000000Z
Is RoPE the Guiding Light of Length Extrapolation, or a Spectral Disaster? The Truth Is Buried in the Fourier Domain
PaperWeekly 2025-08-11T09:00:03.000000Z
Unifying Mixture of Experts and Multi-Head Latent Attention for Efficient Language Models
cs.AI updates on arXiv.org 2025-08-05T11:10:02.000000Z
An Introduction to RoPE Positional Encoding and Its Optimizations
掘金 人工智能 2025-07-23T09:59:49.000000Z
[Hand-Building LLMs] Writing Llama3 from Scratch
掘金 人工智能 2025-07-18T06:08:17.000000Z
ICML 2025 | A Lifesaver for Long Texts! Tsinghua and Collaborators Propose Fourier Position Embedding, Comprehensively Surpassing RoPE Across Tasks
PaperWeekly 2025-05-20T07:52:38.000000Z
ICML 2025 | Massive Values in the Attention Mechanism: The Key to Cracking Contextual Understanding in Large Language Models
机器之心 2025-05-06T07:41:38.000000Z
Transformers Gain Robust Multidimensional Positional Understanding: University of Manchester Researchers Introduce a Unified Lie Algebra Framework for N-Dimensional Rotary Position Embedding (RoPE)
MarkTechPost@AI 2025-04-15T02:50:34.000000Z
Fudan's NLP Team Proposes MHA2MLA, a Framework for Migrating Any Large Model to DeepSeek's MLA
PaperWeekly 2025-03-07T13:06:38.000000Z
Designing Positional Encodings
智源社区 2024-12-04T07:19:29.000000Z
A HuggingFace Engineer's Firsthand Guide: How to Implement the Best Positional Encoding in a Transformer
机器之心 2024-11-27T05:54:17.000000Z
News | Beyond Attention: How Advanced Positional Embedding Methods Improve on the Original Approach in the Transformer Architecture
智源社区 2024-11-02T04:23:33.000000Z