cs.AI updates on arXiv.org 10月21日 12:20
EvolveR:自我改进的智能体框架
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出EvolveR框架,使智能体通过闭环经验生命周期自我改进,包括离线自蒸馏和在线交互两个阶段,实现问题解决策略的迭代优化,在多跳问答基准测试中表现优异。

arXiv:2510.16079v1 Announce Type: cross Abstract: Current Large Language Model (LLM) agents show strong performance in tool use, but lack the crucial capability to systematically learn from their own experiences. While existing frameworks mainly focus on mitigating external knowledge gaps, they fail to address a more fundamental limitation: the inability to iteratively refine problem-solving strategies. In this work, we introduce EvolveR, a framework designed to enable agent to self-improve through a complete, closed-loop experience lifecycle. This lifecycle comprises two key stages: (1) Offline Self-Distillation, where the agent's interaction trajectories are synthesized into a structured repository of abstract, reusable strategic principles; (2) Online Interaction, where the agent interacts with tasks and actively retrieves distilled principles to guide its decision-making, accumulating a diverse set of behavioral trajectories. This loop employs a policy reinforcement mechanism to iteratively update the agent based on its performance. We demonstrate the effectiveness of EvolveR on complex multi-hop question-answering benchmarks, where it achieves superior performance over strong agentic baselines. Our work presents a comprehensive blueprint for agents that learn not only from external data but also from the consequences of their own actions, paving the way for more autonomous and continuously improving systems. Code is available at https://github.com/Edaizi/EvolveR.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

智能体 自我改进 闭环经验 问题解决 多跳问答
相关文章
如何找到问题的根源?找到问题后又该如何获取答案?为什么知道了问题,却仍然难以解决? 在回顾《原则》里的摘录时,其他几本书里与「问题解决」有关的笔记也浮...
近期学到的一个技能: 相信别人已经做过。 很多问题的解决方案,这个世界上已经存在过。一定有这个世界上某个团队某个人已经思考的非常透彻非常成熟,可能在书籍...
Comment on Breaking – CoinMex to Launch Its Own Token for Exchange and Become First Crypto Exchange to Support ONG Introducing CT by Новосибирск медициналық университетінің оқу ақысы 2023
客观来说,这世界上没有一个人会贫穷,如果你真的穷,你做好一件事情,我一年给到你100万,如果你拿不到钱,我给你。 什么事情呢?一年365天,你每天去锻炼、减...
给男孩子的建议 1、普通男孩,翻盘的第一张牌,必须去更好的圈子,去认识更高层次的人。而更好的圈子,只能是工作、学校、或者由家境带来。至于什么酒吧、高档酒...
猴子心态VS僧人心态 Monkey Mind:被细节困扰 过度思考 抱怨比较批评 短期满足 苛刻专权 自我中心执迷不悟 被愤怒等情绪掌控 寻求应急之道 Monk Mind: 专注问题...
不好说什么,如果自己都没有什么变好的欲望的话,那就只能无病呻吟了
昆仑万维与北京联通达成战略合作
量贩零食不赚钱,老牌巨头被震伤,到底谁赢了?
百度何俊杰:大模型不应该只向内卷算力、卷参数,更应该向外卷场景、卷问题