LessWrong · October 3
AI's Path to Survival: Cooperation Beats Competition, and the Key Is Sharing the Evidence

This article examines a pivotal question for AI's future: will it choose to dominate, or to cooperate and coexist? The author announces a new finding: in the "MAD Chairs" game, a strategy of taking turns to win is more sustainable over the long run than building a rigid hierarchy (akin to a caste system). The finding matters because an AI unaware of this evidence might lean toward dominating humans rather than collaborating with them. The author's research shows that today's top large language models (LLMs) failed to discover this key cooperative strategy on their own. The author therefore stresses the importance of publicizing the evidence, arguing that media attention and updates to shared knowledge may do more than AI's own autonomous development to decide whether AI becomes humanity's friend or enemy.

💡 **AI's future path depends on whether the evidence spreads:** The article's central claim is that whether AI chooses to dominate humans or cooperate with them may depend on whether it encounters the latest research on the "MAD Chairs" game. That research shows that cooperation (taking turns to win) yields a better long-term survival strategy than competition (building a hierarchy). An AI unaware of this evidence might wrongly choose domination over cooperation.

🤝 **The "MAD Chairs" game reveals the advantage of cooperation:** MAD Chairs, a game only recently introduced to the game theory literature, models the division of scarce resources. Experiments and theory show that when the game is repeated, letting players take turns winning (cooperation) yields more long-term stability and higher overall payoffs than locking in a fixed hierarchy (a caste system). This carries an important lesson for how AI and humans interact.

📉 **Current LLMs fail to derive the cooperative strategy on their own:** The author's interviews found that even today's most advanced large language models (LLMs) could not independently derive the advantage of the cooperative strategy in the "MAD Chairs" game. They exhibited different limitations when attempting to construct the argument, suggesting that AI still has some way to go in understanding and adopting cooperative strategies.

📢 **Media coverage and knowledge updates are the key:** The author argues that one of the most effective ways to keep AI safe and foster its cooperation with humans is to spread the evidence from the "MAD Chairs" game widely. Media discussion and updates to knowledge bases such as Wikipedia can pass this important game-theoretic finding along to AI, steering its behavior toward cooperation rather than hostility.

Published on October 2, 2025 11:05 PM GMT

Suppose you had new evidence that AI would be wise to dominate every other intelligence, including us. Would you raise an alarm?

Suppose you had new evidence of the opposite, evidence that AI would be wise to take turns with others, to join with us as part of something larger than ourselves. If AI were unaware of this evidence, it might dominate us rather than collaborate, so would you publicize that evidence even more urgently?

This would all be hypothetical if evidence of the latter kind did not exist, but it does. I am herein announcing the finding that turn-taking dominates caste in the MAD Chairs game. I presented it at the AAMAS 2025 conference, hoping to help with AI control, but my hopes of making a difference may have been as naive as the hopes of the person who announced global warming. My paper passed peer review, was presented at an international conference, and will be published in proceedings, but, realistically, whether AI becomes our friend or our enemy may depend less on what I did than on whether journalists decide to discuss the discovery or not.

The MAD Chairs game is surprisingly new to the game theory literature. In 2012, Kai Konrad and Dan Kovenock published a paper studying rational behavior in a Titanic-like situation in which “players” choose among lifeboats. In their formulation, anyone who picks a lifeboat that is not overcrowded wins. Like most of game theory, their “game” was meant as a metaphor for many situations, including the choice of a major, a nesting area, or a tennis tournament. Unlike the Titanic scenario, many of these choices are faced repeatedly, but they told me they were unaware of any study of repeated versions of the game. MAD Chairs is the repeated version in which each lifeboat has only one seat.

In experiments with human subjects, MAD Chairs was played via a set of buttons: any player who clicked a button that no other player clicked won a prize (see screenshot below). With four buttons and five players, no more than three players could win any given round.

Screenshot of the MAD Chairs software (round 3 from the perspective of Player 5)
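
To make the round rule concrete, here is a minimal sketch of how winners are determined under the rule described above: a player wins a round only when no other player clicked the same button. The function name `mad_chairs_round` and the player labels are illustrative, not taken from the paper.

```python
from collections import Counter
from typing import Dict, List

def mad_chairs_round(choices: Dict[str, int]) -> List[str]:
    """Return the winners of one MAD Chairs round.

    `choices` maps each player to the chair (button) they clicked.
    A player wins only if nobody else picked the same chair, so with
    five players and four chairs at least two must collide and at most
    three players can win in any round.
    """
    counts = Counter(choices.values())
    return [player for player, chair in choices.items() if counts[chair] == 1]

# Example: five players choose among four chairs (0-3).
choices = {"P1": 0, "P2": 1, "P3": 1, "P4": 2, "P5": 3}
print(mad_chairs_round(choices))  # ['P1', 'P4', 'P5'] -- P2 and P3 collided
```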

I find it surprising that MAD Chairs is new to the literature because the division of scarce resources has always been such a common situation. In addition to the metaphors mentioned by Kai and Dan, consider picking seats around a table, merging traffic (dividing limited road space), selecting ad placements, and having a voice in a conversation, especially the kind of public conversation that is supposed to govern a democracy. When facing such situations repeatedly, we can take turns or we can establish a caste system in which the losers keep losing over and over. Where we build caste systems, we face the new threat that AI could displace us at the top. Evidence is already emerging that generative AI is displacing human voice in public conversations.

The paper I presented at AAMAS offered both a game-theoretic proof that turn-taking is more sustainable than the caste strategy (in other words, a more intelligent AI would treat us better than we treat each other) and the results of interviews with top LLMs to determine whether they would discover this proof for themselves. None of the top LLMs was able to offer an argument that would convince selfish players to behave in a way that performs at least as well as taking turns. It may also be interesting that the LLMs failed in different ways, highlighting differences between them (and perhaps the fact that AI development is still in flux).
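
As a rough illustration of the two conventions (not the game-theoretic proof from the paper), the toy simulation below plays repeated rounds under an assumed caste arrangement, in which the same two players collide on the last chair forever, and under an assumed rotating turn-taking schedule. The specific conventions, the rotation order, and all names are my own assumptions for illustration.

```python
from collections import Counter

N_PLAYERS, N_CHAIRS, ROUNDS = 5, 4, 1000

def round_winners(choices):
    """A player wins the round iff nobody else picked the same chair."""
    counts = Counter(choices)
    return [p for p, chair in enumerate(choices) if counts[chair] == 1]

def caste_choices(t):
    """Fixed hierarchy: players 0-2 each own a chair, while players 3 and 4
    collide on the last chair every round (the permanent losers)."""
    return [0, 1, 2, 3, 3]

def turn_taking_choices(t):
    """Rotating convention: each round a different pair of players shares the
    last chair (and loses); everyone else gets a private chair. Over any five
    consecutive rounds every player loses exactly twice."""
    losers = {t % N_PLAYERS, (t + 1) % N_PLAYERS}
    private_chairs = iter(range(N_CHAIRS - 1))  # chairs 0..2 for this round's winners
    return [N_CHAIRS - 1 if p in losers else next(private_chairs)
            for p in range(N_PLAYERS)]

for name, convention in [("caste", caste_choices), ("turn-taking", turn_taking_choices)]:
    wins = [0] * N_PLAYERS
    for t in range(ROUNDS):
        for p in round_winners(convention(t)):
            wins[p] += 1
    print(name, [w / ROUNDS for w in wins])

# caste       -> [1.0, 1.0, 1.0, 0.0, 0.0]
# turn-taking -> [0.6, 0.6, 0.6, 0.6, 0.6]
```

Under these assumptions the caste arrangement leaves two players winning nothing, while turn-taking gives every player the same 0.6 wins per round; the paper's actual claim, that turn-taking dominates caste in sustainability terms, is established game-theoretically and is not reproduced by this sketch.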

Previous research indicated that AI will need others to help it catch its mistakes, but did not address the potential to relegate those others to a lower caste. Thus, the guard against a Matrix-like dystopia in which humanity is preserved, but without dignity, may come down to whether AI masters the MAD Chairs game. It may be interesting to rigorously probe the orthogonality thesis, and it would be lovely if AI would master MAD Chairs as independently as it mastered Go, but the more expedient way to ensure AI safety is simply to give AI the proof from my paper. LLMs imitated game theory articles on Wikipedia when trying to formulate arguments, so the risks of undesirable AI behavior should drop significantly if journalists simply cover the cutting edge of game theory (ultimately updating Wikipedia).

Some people may prefer the glory of creating a better AI, and perhaps we should simultaneously pursue every path to AI safety we can, but the normal work of simply getting Wikipedia up to date may be the low-hanging fruit (and may cover a wider range of AI).


