无限重复定价游戏中的算法性串谋与相对表现学习

cs.AI updates on arXiv.org 10月07日

本文研究了无限重复的通用求和定价游戏中，独立强化学习者的串谋行为及其与相对表现学习的关系，发现相对表现对长期结果具有关键作用，并提出了缓解算法性串谋和过拟合问题的方法。

arXiv:2102.09139v3 Announce Type: replace-cross Abstract: In an infinitely repeated general-sum pricing game, independent reinforcement learners may exhibit collusive behavior without any communication, raising concerns about algorithmic collusion. To better understand the learning dynamics, we incorporate agents' relative performance (RP) among competitors using experience replay (ER) techniques. Experimental results indicate that RP considerations play a critical role in long-run outcomes. Agents that are averse to underperformance converge to the Bertrand-Nash equilibrium, while those more tolerant of underperformance tend to charge supra-competitive prices. This finding also helps mitigate the overfitting issue in independent Q-learning. Additionally, the impact of relative ER varies with the number of agents and the choice of algorithms.

Fish AI Reader

AI辅助创作，多种专业模板，深度分析，高质量内容生成。从观点提取到深度思考，FishAI为您提供全方位的创作支持。新版本引入自定义参数，让您的创作更加个性化和精准。

FishAI

鱼阅，AI 时代的下一个智能信息助手，助你摆脱信息焦虑

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签