ScaleRL_Fishai

Trending

Articles related to "ScaleRL"

How to scale RL

Interconnects 2025-10-20T16:39:57.000000Z

Meta Spent 400,000 GPU-Hours on a Single Experiment, Just to Pin Down the Scaling Law for Reinforcement Learning

机器之心 2025-10-20T14:24:54.000000Z

Meta Spent $4.2 Million and Burned 400,000 GPU-Hours Just to Validate a Sigmoid Curve

PaperWeekly 2025-10-19T08:34:29.000000Z

Sigmoidal Scaling Curves Make Reinforcement Learning RL Post-Training Predictable for LLMs

MarkTechPost@AI 2025-10-18T02:42:51.000000Z

Meta Spent $4.2 Million and Burned 400,000 GPU-Hours Just to Validate a Sigmoid Curve

PaperWeekly 2025-10-17T16:47:03.000000Z

Copyright © 2019 FISHAI. All Rights Reserved