Trending
Articles related to "PutnamGAP dataset"
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
cs.AI updates on arXiv.org 2025-08-13T04:14:51.000000Z