cs.AI updates on arXiv.org, September 3
Meta-Reasoning Challenges for LLMs in Math Problem Solving

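This article examines the challenges large language models face when solving math problems, particularly on meta-reasoning tasks such as identifying the erroneous steps in student solutions. Using two error reasoning datasets, VtG and PRM800K, it finds that state-of-the-art LLMs struggle to locate the first error step in a student's solution. It proposes generating an intermediate corrected student solution, which helps improve model performance.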

arXiv:2509.01395v1 Announce Type: cross Abstract: Large language models (LLMs) demonstrate remarkable performance on math word problems, yet they have been shown to struggle with meta-reasoning tasks such as identifying errors in student solutions. In this work, we investigate the challenge of locating the first error step in stepwise solutions using two error reasoning datasets: VtG and PRM800K. Our experiments show that state-of-the-art LLMs struggle to locate the first error step in student solutions even when given access to the reference solution. To that end, we propose an approach that generates an intermediate corrected student solution, aligning more closely with the original student's solution, which helps improve performance.
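To make the task concrete, here is a minimal sketch of how first-error-step localization might be scored. It assumes each example carries a gold index for the first erroneous step (or `None` when the solution has no error) and that the model returns a predicted index in the same form. The function name `first_error_accuracy` and the index-based data format are hypothetical and not taken from the paper, whose actual metric and data layout may differ.

```python
from typing import Optional, Sequence


def first_error_accuracy(
    gold: Sequence[Optional[int]],
    pred: Sequence[Optional[int]],
) -> float:
    """Exact-match accuracy for first-error-step localization.

    Assumption (not from the paper): each entry is the 0-based index of
    the first wrong step in a student solution, or None if no step is wrong.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    # A prediction counts only if it names exactly the first error step.
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


# Hypothetical example: three student solutions, two located correctly.
score = first_error_accuracy([2, 0, 3], [2, 1, 3])
```

Exact match is a strict choice: a prediction that lands one step after the true first error scores the same as one that misses entirely, which fits the abstract's framing of locating the first error step rather than any error step.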

Related tags

LLMs, math problem solving, meta-reasoning, error identification, model performance