Articles related to "Code Evaluation"
Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards
cs.AI updates on arXiv.org
2025-10-28T04:05:00.000000Z
AutoCode: A New AI Framework that Lets LLMs Create and Verify Competitive Programming Problems, Mirroring the Workflow of Human Problem Setters
MarkTechPost@AI
2025-10-18T09:11:05.000000Z
McMining: Automated Discovery of Misconceptions in Student Code
cs.AI updates on arXiv.org
2025-10-13T04:13:27.000000Z
ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
cs.AI updates on arXiv.org
2025-07-18T04:13:54.000000Z
AGACCI: Affiliated Grading Agents for Criteria-Centric Interface in Educational Coding Contexts
cs.AI updates on arXiv.org
2025-07-09T04:01:39.000000Z
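First to cover more than 11 categories of real-world programming scenarios! The Doubao large model team open-sources a new benchmark for code LLMs
ByteDance Technical Team (字节跳动技术团队)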
2024-12-07T10:45:19.000000Z
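First to cover more than 11 categories of real-world programming scenarios! The Doubao large model team open-sources a new benchmark for code LLMs
Doubao MarsCode (豆包MarsCode)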
2024-12-06T11:44:28.000000Z
CodeJudge: A Machine Learning Framework that Leverages LLMs to Evaluate Code Generation Without the Need for Test Cases
MarkTechPost@AI
2024-10-17T12:03:46.000000Z
Pulling itself up by its own bootstraps! OpenAI's new model lets GPT-4 train GPT-4
快科技资讯
2024-06-28T09:05:23.000000Z