Trending
Articles related to "ScaleRL"
How to scale RL
Interconnects 2025-10-20T16:39:57.000000Z
Meta Ran a 400,000 GPU-Hour Experiment Just to Work Out the Scaling Law for Reinforcement Learning
机器之心 2025-10-20T14:24:54.000000Z
Meta Spent $4.2 Million and Burned 400,000 GPU-Hours Just to Validate One Sigmoid Curve
PaperWeekly 2025-10-19T08:34:29.000000Z
Sigmoidal Scaling Curves Make Reinforcement Learning RL Post-Training Predictable for LLMs
MarkTechPost@AI 2025-10-18T02:42:51.000000Z
Meta Spent $4.2 Million and Burned 400,000 GPU-Hours Just to Validate One Sigmoid Curve
PaperWeekly 2025-10-17T16:47:03.000000Z