A review of the AI risk book "If Anyone Builds It, Everyone Dies"

This article reviews the book "If Anyone Builds It, Everyone Dies". The reviewer considers the book's first two parts an excellent explanation of AI misalignment risk for a general audience, but finds the third part weak and even suggests skipping it. Although he does not fully agree with the book's arguments and recommendations, he still recommends it to non-specialist readers and hopes more people will read it. The review works through each part of the book, including its explanation of neural networks, its AI-takeover story, and its discussion of existing responses, and lays out the reviewer's own objections, particularly around AI timelines and risk assessment.

📚 The book's first two parts are regarded as excellent material for explaining AI misalignment risk to general readers, clearly laying out the basics of how neural networks work and why powerful AI might not share human goals. The reviewer particularly praises the book's treatment of the evolution analogy as clear and insightful, while noting that the analogy's limitations are not adequately discussed.

💡 The book's third part, on responses to AI risk, is judged uneven in quality, with some sections described as "pretty bad", and the reviewer suggests skipping it. Its arguments against proposed mitigations such as automating AI alignment research fail to engage with the strongest versions of those proposals, and the effectiveness of measures like restrictions on GPU clusters is also questioned.

🤔 The reviewer has a major disagreement with the book's core thesis, in particular with how it moves from the caveated version of "if anyone builds it, everyone dies" to the uncaveated one. He argues that this step relies on questionable assumptions, overlooks key changes and opportunities for intervention on the way to advanced AI, and fails to adequately explore the relationship between the trajectory of AI development and its eventual outcome.

👍 Despite these criticisms, the reviewer still "happily recommends" the book to non-specialists, judging that no other lay-audience resource comes close to it at explaining what AI is, describing superintelligence, and making the basic case for misalignment risk. He hopes more people will read it; even where parts are debatable, it can prompt serious discussion of AI risk.

Published on September 17, 2025 4:34 AM GMT

I listened to "If Anyone Builds It, Everyone Dies" today.

I think the first two parts of the book are the best available explanation of the basic case for AI misalignment risk for a general audience. I thought the last part was pretty bad, and would probably recommend skipping it. The authors fail to address counterarguments that I think are crucial; as a result, I am not persuaded of the book's thesis, and I think the book neglects to discuss crucial aspects of the situation and makes poor recommendations. Even so, I would happily recommend the book to a lay audience, and I hope that more people read it.

I can't give an overall assessment of how well this book will achieve its goals. The point of the book is to be well-received by people who don't know much about AI, and I'm not very good at predicting how laypeople will respond to it; it seems like results so far are mixed, leaning positive. So I'll just talk about whether I think the arguments in the book are reasonable enough that I want them to be persuasive to the target audience, rather than whether I think they'll actually succeed.

Thanks to several people for helpful and quick comments and discussion, especially Oli Habryka and Malo Bourgon!

Synopsis

Here's a synopsis and some brief thoughts, part-by-part:

I personally (unlike e.g. Shakeel) really liked the writing throughout. I'm a huge fan of Eliezer's fiction and most of his non-fiction that doesn't talk about AI, so maybe this is unsurprising. I often find it annoying to read things Eliezer and Nate write about AI, but I genuinely enjoyed the experience of listening to the book. (Also, the narrator for the audiobook does a hilarious job of rendering the dialogues and parables.)

My big disagreement

In the text, the authors often state a caveated version of the title, something like "If anyone builds it (with techniques like those available today), everyone dies". But they also frequently state or imply the uncaveated title. I'm quite sympathetic to something like the caveated version of the title[2]. But I have a huge problem with equivocating between the caveated and uncaveated versions.

There are two possible argument structures that I think you can use to go from the caveated thesis to the uncaveated one, and both rely on steps that are IMO dubious:

Argument structure one:

This is the argument that I (perhaps foolishly and incorrectly) understood Eliezer and Nate to be making when I worked with them, and the argument I made when I discussed AI x-risk five years ago, right before I started changing my mind on takeoff speeds.

I think Eliezer and Nate aren’t trying to make this argument—they are agnostic on timelines and they don’t want to argue that sub-ASI AI will be very unimportant for the world. I think they are using what I’ll call “argument structure two”:

The authors are (unlike me) confident in tricky hypothesis 2. The book says almost nothing about either the big complication or tricky hypothesis 2, and I think that’s a big hole in their argument that a better book would have addressed.[3] (I find Eliezer’s arguments extremely uncompelling.)

I think that explicitly mentioning the big complication is pretty important for giving your audience an accurate picture of what you're expecting. Whenever I try to picture the development of ASI, it's really salient in my picture that that world already has much more powerful AI than today’s, and the AI researchers will be much more used to seeing their AIs take unintended actions that have noticeably bad consequences. Even aside from the question of whether it changes the bottom line, it’s a salient-enough part of the picture that it feels weird to neglect discussing it.

And of course, the core disagreement that leads me to disagree so much with Eliezer and Nate on both P(AI takeover) and on what we should do to reduce it: I don't agree with tricky hypothesis 2. I think that the trajectory between here and ASI gives a bunch of opportunities for mitigating risk, and most of our effort should be focused on exploiting those opportunities. If you want to read about this, you could check out the back-and-forth my coworkers and I had with some MIRI people here, or the back-and-forth Scott Alexander and Eliezer had here.

(This is less relevant given the authors’ goal for this book, but from my perspective, another downside of not discussing tricky hypothesis 2 is that, aside from being relevant to estimating P(AI takeover), understanding the details of these arguments is crucial if you want to make progress on mitigating these risks.)

If they wanted to argue a weaker claim, I'd be entirely on board. For example, I’d totally get behind:

But instead, they propose a much stronger thesis that they IMO fail to justify.

This disagreement leads to my disagreement with their recommendations—relatively incremental interventions seem much more promising to me.

(There’s supplementary content online. I only read some of this content, but it seemed somewhat lower quality than the book itself. I'm not sure how much of that is because the supplementary content is actually worse, and how much of it is because the supplementary content gets more into the details of things—I think that the authors and MIRI staff are very good at making simple conceptual arguments clearly, and are weaker when arguments require attention to detail.)

(I will also parenthetically remark that superintelligence is less central in my picture than theirs. I think that there is substantial risk posed by AIs that are not wildly superintelligent, and it's plausible that humans purposefully or involuntarily cede control to AIs that are less powerful than the wildly superintelligent ones the authors describe in this book. This causes me to disagree in a bunch of places.)

I tentatively support this book

I would like it if more people read this book, I think. The main downsides are:

Despite my complaints, I’m happy to recommend the book, especially with the caveat that I think it's wrong about a bunch of stuff. Even given all the flaws, I don't know of a resource for laypeople that’s half as good at explaining what AI is, describing superintelligence, and making the basic case for misalignment risk. After reading the book, it feels like a shocking oversight that no one wrote it earlier.

  1. ^

     In their story, the company figures out a way to scale the AI in parallel, and then the company suddenly massively increases the parallel scale and the AI starts plotting against them. This seems somewhat implausible—probably the parallel scale would be increased gradually, just for practical reasons. But if that scaling had happened more gradually, the situation probably still wouldn't have gone that well for humanity if the AI company was as incautious as I expect, so whatever. (My objection here is different from what Scott complained about and Eliezer responded to here—I’m not saying it’s hugely unrealistic for parallel scaling to pretty suddenly lead to capabilities improving as rapidly as depicted in the book, I’m saying that if such a parallel scaling technique was developed, it would probably be tested out with incrementally increasing amounts of parallelism, if nothing else just for practical engineering reasons.)

  2. ^

     My main problem with the caveated version of the title is again that I think they’re inappropriately reasoning about what happens for arbitrarily intelligent models instead of reasoning about what happens with AIs that are just barely capable enough to count as ASI. Their arguments (that AIs will learn goals that are egregiously misaligned with human goals and then conspire against us) are much stronger for wildly galaxy-brained AIs than for AIs that are barely smart enough to count as superhuman.

  3. ^

     I don't think Eliezer and Nate are capable of writing this better book, because I think their opinions on this topic are pretty poorly thought through.


