Hot Topics
Articles related to "RoPE"
Mamba-3 Surfaces Among ICLR 2026 Submissions: A Triple Upgrade Going All In on the "Inference-First" Paradigm
PaperWeekly 2025-10-12T15:43:41.000000Z
Wavelet-Induced Rotary Encodings: RoPE Meets Graphs
cs.AI updates on arXiv.org 2025-09-29T04:16:02.000000Z
No More KV-Cache Blowups! Andrew Yao's Team at Tsinghua Rewrites the Attention Dimension for Leaner, Stronger Long Contexts | NeurIPS 2025 Spotlight
PaperWeekly 2025-09-25T15:44:28.000000Z
Q-ROAR: Outlier-Aware Rescaling for RoPE Position Interpolation in Quantized Long-Context LLMs
cs.AI updates on arXiv.org 2025-09-19T04:34:54.000000Z
Length-Aware Rotary Position Embedding for Text-Speech Alignment
cs.AI updates on arXiv.org 2025-09-16T05:25:03.000000Z
Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
cs.AI updates on arXiv.org 2025-09-16T05:07:20.000000Z
Xiaomi LLM Team's Paper Selected for ACL 2025 SAC Highlights
小米技术 2025-09-12T08:06:06.000000Z
Is RoPE the Guiding Light of Length Extrapolation, or a Spectral Disaster? The Truth Is Buried in the Fourier Domain
PaperWeekly 2025-08-11T09:00:03.000000Z
Unifying Mixture of Experts and Multi-Head Latent Attention for Efficient Language Models
cs.AI updates on arXiv.org 2025-08-05T11:10:02.000000Z
An Introduction to RoPE Positional Encoding and Its Optimizations
掘金 人工智能 2025-07-23T09:59:49.000000Z
[Hand-Building LLMs] Writing Llama3 from Scratch
掘金 人工智能 2025-07-18T06:08:17.000000Z
ICML 2025 | A Lifesaver for Long Texts! Tsinghua and Collaborators Propose Fourier Position Embedding, Comprehensively Surpassing RoPE Across Tasks
PaperWeekly 2025-05-20T07:52:38.000000Z
ICML 2025 | Massive Values in the Attention Mechanism: The Key to Cracking Contextual Understanding in Large Language Models
机器之心 2025-05-06T07:41:38.000000Z
Transformers Gain Robust Multidimensional Positional Understanding: University of Manchester Researchers Introduce a Unified Lie Algebra Framework for N-Dimensional Rotary Position Embedding (RoPE)
MarkTechPost@AI 2025-04-15T02:50:34.000000Z
Fudan's NLP Team Proposes MHA2MLA, a Framework for Migrating Any Large Model to DeepSeek's MLA
PaperWeekly 2025-03-07T13:06:38.000000Z
Designing Positional Encodings
智源社区 2024-12-04T07:19:29.000000Z
A HuggingFace Engineer's Firsthand Guide: How to Implement the Best Positional Encoding in a Transformer
机器之心 2024-11-27T05:54:17.000000Z
News | Beyond Attention: How Advanced Positional Embedding Methods Improve on the Original Approach in the Transformer Architecture
智源社区 2024-11-02T04:23:33.000000Z