Hot Topics
Articles related to "training acceleration"
If RL is predictable, do we still need to run training to the end? USTC reveals the linear secret of parameter updates
PaperWeekly 2025-10-14T14:42:26.000000Z
What Makes Looped Transformers Perform Better Than Non-Recursive Ones (Provably)
cs.AI updates on arXiv.org 2025-10-14T04:17:45.000000Z
On Predictability of Reinforcement Learning Dynamics for Large Language Models
cs.AI updates on arXiv.org 2025-10-02T04:18:05.000000Z
SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts
cs.AI updates on arXiv.org 2025-09-30T04:04:14.000000Z
Improving Consistency Models with Generator-Augmented Flows
cs.AI updates on arXiv.org 2025-07-03T04:07:15.000000Z
8x faster than GRPO on GSM8K! Xiamen University proposes CPPO, making reinforcement learning lightning-fast
掘金 AI 2025-04-01T10:57:46.000000Z
Punctuation marks become a powerful tool for LLM training! KV cache cut in half, handles long sequences of up to 4 million tokens, from Huawei, HKU, and others | Open source
智源社区 2025-03-04T10:13:02.000000Z