cs.AI updates on arXiv.org 10月08日
决策Transformer:强化学习与Transformer模型的结合
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出了一种结合强化学习与Transformer模型的决策Transformer(DT)算法,通过优化动作梯度(AG)技术,解决了轨迹拼接和动作外推问题,显著提升了DT算法的性能。

arXiv:2510.05285v1 Announce Type: cross Abstract: Decision Transformer (DT), which integrates reinforcement learning (RL) with the transformer model, introduces a novel approach to offline RL. Unlike classical algorithms that take maximizing cumulative discounted rewards as objective, DT instead maximizes the likelihood of actions. This paradigm shift, however, presents two key challenges: stitching trajectories and extrapolation of action. Existing methods, such as substituting specific tokens with predictive values and integrating the Policy Gradient (PG) method, address these challenges individually but fail to improve performance stably when combined due to inherent instability. To address this, we propose Action Gradient (AG), an innovative methodology that directly adjusts actions to fulfill a function analogous to that of PG, while also facilitating efficient integration with token prediction techniques. AG utilizes the gradient of the Q-value with respect to the action to optimize the action. The empirical results demonstrate that our method can significantly enhance the performance of DT-based algorithms, with some results achieving state-of-the-art levels.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

决策Transformer 强化学习 Transformer模型 动作梯度 性能提升
相关文章