Articles related to "Training Stability"
Reg-DPO: SFT-Regularized Direct Preference Optimization with GT-Pair for Improving Video Generation
cs.AI updates on arXiv.org 2025-11-05T05:30:33.000000Z
VL Norm: A Key Step Toward More Stable, Faster Reinforcement Learning
微软研究院AI头条 2025-10-22T17:17:47.000000Z
AMiD: Knowledge Distillation for LLMs with $\alpha$-mixture Assistant Distribution
cs.AI updates on arXiv.org 2025-10-21T04:16:06.000000Z
Taming the Judge: Deconflicting AI Feedback for Stable Reinforcement Learning
cs.AI updates on arXiv.org 2025-10-20T04:09:15.000000Z
New Xiaomi AI paper lists Luo Fuli as an author, the DeepSeek "genius girl" Lei Jun sought to poach with a ten-million annual salary
IT之家 2025-10-16T04:50:13.000000Z
Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers
cs.AI updates on arXiv.org 2025-10-14T04:20:13.000000Z
Stability of Transformers under Layer Normalization
cs.AI updates on arXiv.org 2025-10-14T04:16:25.000000Z
AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning
cs.AI updates on arXiv.org 2025-10-08T04:11:46.000000Z
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
cs.AI updates on arXiv.org 2025-10-07T04:16:36.000000Z
Interactive Training: Feedback-Driven Neural Network Optimization
cs.AI updates on arXiv.org 2025-10-03T04:18:56.000000Z
Stable Forgetting: Bounded Parameter-Efficient Unlearning in LLMs
cs.AI updates on arXiv.org 2025-09-30T04:06:16.000000Z
A Unified Noise-Curvature View of Loss of Trainability
cs.AI updates on arXiv.org 2025-09-25T05:48:01.000000Z
SimpleTIR: Keeping Large Models from Collapsing When They "Think While Writing Code"
AI科技评论 2025-09-11T16:38:16.000000Z
Details revealed on Ant's large-model training with domestic GPUs! Head of Ling model R&D explains the story behind it in a post
富途牛牛头条 2025-03-27T10:54:58.000000Z
GAN Returns: Greatly Simplified Model, More Stable Training, a Comeback Against Diffusion Models, Going Viral in the AI Community
我爱计算机视觉 2025-01-14T13:12:07.000000Z
Papers I’ve read this week, Mixture of Experts edition
Artificial Fintelligence 2024-10-22T06:07:41.000000Z
Analyzing the Impact of Flash Attention on Numeric Deviation and Training Stability in Large-Scale Machine Learning Models
MarkTechPost@AI 2024-05-10T16:27:41.000000Z