"
AdamW
" 相关文章
Robust Layerwise Scaling Rules by Proper Weight Decay Tuning
cs.AI updates on arXiv.org
2025-10-20T04:12:44.000000Z
Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
cs.AI updates on arXiv.org
2025-10-14T04:20:11.000000Z
REG: A Regularization Optimizer for Robust Training Dynamics
cs.AI updates on arXiv.org
2025-10-07T04:15:29.000000Z
From Muon to AdaMuon: Can the Next Generation of Optimizers Truly Replace Adam?
PaperWeekly
2025-09-17T02:10:12.000000Z
Stanford: A "Battle of the Gods" Among Optimizers? AdamW Wins on "Stability"
机器之心
2025-09-11T16:05:32.000000Z
Stanford: A "Battle of the Gods" Among Optimizers? AdamW Wins on "Stability"
36kr
2025-09-07T23:45:16.000000Z
Stanford: A "Battle of the Gods" Among Optimizers? AdamW Wins on "Stability"
机器之心
2025-09-07T06:48:49.000000Z
Improve Training Results with Just One Line of Code!
掘金 人工智能
2025-06-09T02:53:46.000000Z
The Open-Source Race Is Getting Crowded! Moonshot AI (月之暗面) Open-Sources a New Version of the Muon Optimizer
机器之心
2025-02-24T05:55:32.000000Z
Eliminating Fixed Learning Rate Schedules in Machine Learning: How Schedule-Free AdamW Optimizer Achieves Superior Accuracy and Efficiency Across Diverse Applications
MarkTechPost@AI
2024-11-15T08:20:00.000000Z