Trending
Articles related to "PokeeResearch-7B"
PokeeResearch-7B: An Open 7B Deep-Research Agent Trained with Reinforcement Learning from AI Feedback (RLAIF) and a Robust Reasoning Scaffold
MarkTechPost@AI 2025-10-23T03:08:08.000000Z
PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold
cs.AI updates on arXiv.org 2025-10-20T04:10:13.000000Z