Computational Power and Learning Mechanisms of AI Reasoning Agents

cs.AI updates on arXiv.org, two days ago at 12:35


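This study examines the computation that AI reasoning agents perform when they deploy tools, simulate hypotheses, and reflect on the outcomes, and asks whether such agents can be universal. It reinterprets AI agents as compute-capable stochastic dynamical systems and highlights the foundational role of time in learning to reason. The authors propose a shift from inductive learning to transductive learning, where the objective is no longer to fit the distribution of past data but to capture the data's algorithmic structure in order to shorten the time needed to solve new tasks. The study suggests that a key role of information in learning is to reduce time rather than reconstruction error, and it gives a theoretical derivation of the power-law scaling between inference time and training time. Finally, it notes that scaling model size can improve benchmark accuracy yet produce "savant" behavior rather than genuine intelligence, and argues that time, not just model size, is the quantity to optimize.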

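🧠 AI reasoning agents perform computation: The study treats the behavior of AI reasoning agents as a form of computation, even though no program is being executed in the classical sense. On that basis it asks whether such agents can be universal, that is, whether chain-of-thought reasoning can solve any computable task.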

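⏳ The central role of time in learning: The paper proposes a shift from classical inductive learning to transductive learning, in which the key to learning is capturing the algorithmic structure of data to reduce the time needed to solve new tasks, rather than merely fitting past data. The role of information in learning is recast as reducing time rather than reconstruction error.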

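📈 Theoretical derivation and power-law scaling: The study shows that the optimal speed-up a universal solver can achieve using past data is tightly related to the algorithmic information in that data, and uses this result to derive the observed power-law scaling of inference time versus training time.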

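🤖 Limits of model scale and what counts as intelligence: The study argues that scaling model size alone can improve benchmark accuracy while still failing any reasonable test of intelligence. In the limit of unbounded space and time, a large model can act as a "savant" that brute-forces its way through tasks without insight. When scaling reasoning models, the key quantity to optimize is therefore time.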
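The idea that information buys time rather than accuracy can be illustrated with a toy search problem. The sketch below is not the paper's construction. It assumes a hypothetical solver that enumerates n-bit candidates in lexicographic order until a verifier accepts one, and it models "past data" as simply revealing the first k bits of the solution. The function name `brute_force_steps` and the `known_prefix` parameter are invented for this illustration.

```python
from itertools import product

def brute_force_steps(target, known_prefix=()):
    """Count candidates tried before the verifier accepts `target`.

    `known_prefix` stands in for information extracted from past data
    (a hypothetical simplification): those leading bits are fixed and
    only the remaining bits are enumerated.
    """
    n_free = len(target) - len(known_prefix)
    for steps, tail in enumerate(product((0, 1), repeat=n_free), start=1):
        candidate = tuple(known_prefix) + tail
        if candidate == tuple(target):  # verifier: exact match
            return steps
    raise ValueError("known_prefix is inconsistent with target")

# All-ones is the worst case for lexicographic enumeration.
target = (1, 1, 1, 1, 1, 1, 1, 1)
for k in range(0, 9, 2):
    steps = brute_force_steps(target, known_prefix=target[:k])
    print(f"known bits = {k}: {steps} candidates tried")
```

For this target the count falls from 256 to 64, 16, 4, and 1 as k grows, a factor of 2^k. The solver with k = 0 still finds the answer, which is the "savant" regime: given enough time it succeeds without any insight, and what the extra information changes is only how long the search takes.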

arXiv:2510.12066v1 Announce Type: new

Abstract: AI reasoning agents are already able to solve a variety of tasks by deploying tools, simulating outcomes of multiple hypotheses and reflecting on them. In doing so, they perform computation, although not in the classical sense -- there is no program being executed. Still, if they perform computation, can AI agents be universal? Can chain-of-thought reasoning solve any computable task? How does an AI Agent learn to reason? Is it a matter of model size? Or training dataset size? In this work, we reinterpret the role of learning in the context of AI Agents, viewing them as compute-capable stochastic dynamical systems, and highlight the role of time in a foundational principle for learning to reason. In doing so, we propose a shift from classical inductive learning to transductive learning -- where the objective is not to approximate the distribution of past data, but to capture their algorithmic structure to reduce the time needed to find solutions to new tasks. Transductive learning suggests that, counter to Shannon's theory, a key role of information in learning is about reduction of time rather than reconstruction error. In particular, we show that the optimal speed-up that a universal solver can achieve using past data is tightly related to their algorithmic information. Using this, we show a theoretical derivation for the observed power-law scaling of inference time versus training time. We then show that scaling model size can lead to behaviors that, while improving accuracy on benchmarks, fail any reasonable test of intelligence, let alone super-intelligence: In the limit of infinite space and time, large models can behave as savants, able to brute-force through any task without any insight. Instead, we argue that the key quantity to optimize when scaling reasoning models is time, whose critical role in learning has so far only been indirectly considered.
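A power law between inference time and training time can be written as T_inf ≈ a · T_train^(-b), which is a straight line on log-log axes. The abstract does not state the exponent or the constant, and the decreasing form itself is an assumption here. The sketch below uses made-up values (a = 100, b = 0.5) and synthetic, noise-free data purely to show how such an exponent could be read off measurements by least-squares regression in log space. The helper `fit_power_law` is a hypothetical name, not something from the paper.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by ordinary least squares on (log x, log y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
             / sum((u - mean_x) ** 2 for u in lx))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

# Synthetic, illustrative data only: a = 100 and b = 0.5 are assumed values.
train_time = [1, 10, 100, 1000, 10000]
infer_time = [100 * t ** -0.5 for t in train_time]

a, b = fit_power_law(train_time, infer_time)
print(f"a = {a:.2f}, b = {b:.2f}")
```

On this synthetic input the fit recovers the assumed constants. With real measurements the points would scatter around the line, and the fitted b would only be an estimate of the scaling exponent.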
