Math Reasoning_Fishai

Articles related to "Math Reasoning"

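No Hyperparameter Changes, No Token-Level Tweaks: By Replacing the Mean with a Quantile, QAE Makes Reinforcement Learning for Large Models More Stable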

PaperWeekly 2025-10-21T14:54:07.000000Z

QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration

MarkTechPost@AI 2025-10-16T04:32:29.000000Z

Current Language Models Struggle to Reason in Ciphered Language

LessWrong (少点错误) 2025-10-14T09:26:37.000000Z

Training Qwen-1.5B with a CoT legibility penalty

LessWrong (少点错误) 2025-10-09T21:48:46.000000Z

Microsoft AI Introduces rStar2-Agent: A 14B Math Reasoning Model Trained with Agentic Reinforcement Learning to Achieve Frontier-Level Performance

MarkTechPost@AI 2025-08-30T06:56:44.000000Z

Copyright © 2019 FISHAI. All Rights Reserved