Hot Topics
Articles related to "Optimizer"
Exploring Landscapes for Better Minima along Valleys
cs.AI updates on arXiv.org 2025-11-03T05:19:11.000000Z
Backward-Friendly Optimization: Training Large Language Models with Approximate Gradients under Memory Constraints
cs.AI updates on arXiv.org 2025-10-28T04:14:32.000000Z
Dopamine-driven synaptic credit assignment in neural networks
cs.AI updates on arXiv.org 2025-10-28T04:02:23.000000Z
SNOO: Step-K Nesterov Outer Optimizer - The Surprising Effectiveness of Nesterov Momentum Applied to Pseudo-Gradients
cs.AI updates on arXiv.org 2025-10-20T04:14:43.000000Z
Randomness and Interpolation Improve Gradient Descent
cs.AI updates on arXiv.org 2025-10-16T04:25:04.000000Z
The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton
cs.AI updates on arXiv.org 2025-10-13T04:14:24.000000Z
MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates
cs.AI updates on arXiv.org 2025-10-08T04:10:57.000000Z
Conda: Column-Normalized Adam for Training Large Language Models Faster
cs.AI updates on arXiv.org 2025-09-30T04:06:24.000000Z
$\mathbf{Li_2}$: A Framework on Dynamics of Feature Emergence and Delayed Generalization
cs.AI updates on arXiv.org 2025-09-29T04:14:28.000000Z
We released Neural Network Console – Windows Version 1.40
Blog - Neural Network Console 2025-09-25T10:02:24.000000Z
From Muon to AdaMuon: Can Next-Generation Optimizers Truly Replace Adam?
PaperWeekly 2025-09-17T02:10:12.000000Z
Why Is Adam's Update RMS Always 0.2? From Noise Simulation to Theoretical Approximation, Fully Explained
PaperWeekly 2025-09-13T01:34:13.000000Z
Stanford: A "Battle of the Gods" Among Optimizers? AdamW Wins on "Stability"
机器之心 2025-09-11T16:05:32.000000Z
Striking Evidence as a Tsinghua Yao Class Alumnus Exposes the "1.4× Speedup" Trap: Why Do AI Optimizers Fall Short of Their Claims?
36kr 2025-09-08T03:15:50.000000Z
Stanford: A "Battle of the Gods" Among Optimizers? AdamW Wins on "Stability"
36kr 2025-09-07T23:45:16.000000Z
Fantastic Pretraining Optimizers and Where to Find Them
cs.AI updates on arXiv.org 2025-09-03T04:17:43.000000Z
How to Cut Your AI Training Bill by 80%? Oxford’s New Optimizer Delivers 7.5x Faster Training by Optimizing How a Model Learns
MarkTechPost@AI 2025-08-29T09:09:13.000000Z
Kourkoutas-Beta: A Sunspike-Driven Adam Optimizer with Desert Flair
cs.AI updates on arXiv.org 2025-08-19T04:21:03.000000Z
ZetA: A Riemann Zeta-Scaled Extension of Adam for Deep Learning
cs.AI updates on arXiv.org 2025-08-06T04:02:05.000000Z