Meta-Reasoning Challenges for LLMs in Math Problem Solving

cs.AI updates on arXiv.org, September 3


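This article examines the challenges large language models (LLMs) face when solving math problems, particularly on meta-reasoning tasks such as identifying the erroneous step in a student's solution. Using two error reasoning datasets, VtG and PRM800K, it finds that state-of-the-art LLMs struggle to locate the first error step in student solutions. The paper proposes generating an intermediate corrected student solution, which helps improve model performance.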

arXiv:2509.01395v1 Announce Type: cross Abstract: Large language models (LLMs) demonstrate remarkable performance on math word problems, yet they have been shown to struggle with meta-reasoning tasks such as identifying errors in student solutions. In this work, we investigate the challenge of locating the first error step in stepwise solutions using two error reasoning datasets: VtG and PRM800K. Our experiments show that state-of-the-art LLMs struggle to locate the first error step in student solutions even when given access to the reference solution. To that end, we propose an approach that generates an intermediate corrected student solution, aligning more closely with the original student's solution, which helps improve performance.
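To make the task concrete, here is a minimal Python sketch of first-error-step localization. It is not the authors' implementation. The abstract does not say how the intermediate corrected solution is used, so the step-by-step comparison below is an assumption, and the names `first_divergent_step`, `localization_accuracy`, and `normalize` are hypothetical. In practice the corrected steps would come from an LLM; here they are hard-coded.

```python
from typing import List, Optional, Sequence


def normalize(step: str) -> str:
    """Lowercase a step and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(step.lower().split())


def first_divergent_step(student_steps: Sequence[str],
                         corrected_steps: Sequence[str]) -> Optional[int]:
    """Return the 1-based index of the first student step that differs from the
    corrected solution, or None if no difference is found among the steps both
    solutions share (assumption: the two solutions are aligned step by step)."""
    for index, (student, corrected) in enumerate(
            zip(student_steps, corrected_steps), start=1):
        if normalize(student) != normalize(corrected):
            return index
    return None


def localization_accuracy(predicted: List[Optional[int]],
                          gold: List[Optional[int]]) -> float:
    """Fraction of problems where the predicted first-error step equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for p, g in zip(predicted, gold) if p == g)
    return hits / len(gold)


# Illustrative data only; not drawn from VtG or PRM800K.
student = ["3 * 4 = 12", "12 + 5 = 18", "18 / 2 = 9"]
corrected = ["3 * 4 = 12", "12 + 5 = 17", "17 / 2 = 8.5"]
predicted_step = first_divergent_step(student, corrected)
```

With this toy data, step 2 is the first one that differs, so step 3 counts only as a propagated consequence. Exact string matching after normalization is a deliberate simplification: a real system would need a more tolerant notion of step equivalence.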
