LessWrong · two days ago, 06:27
Beware Overconfidence: Lessons from Cryptography to AI Ethics

Drawing on his own experience in a cryptography research department, the author illustrates the risks that overconfidence can pose in technical fields. He once easily broke an internally designed pseudorandom number generator, which drove home the importance of "don't roll your own crypto." The article contrasts cryptography with AI ethics, noting that in AI ethics it is hard to refute an idea with a "clean attack," and that the field lacks widely accepted standard libraries. The author is therefore pessimistic about AI safety, and urges the same wariness of overconfidence in metaethics, to avoid deploying insufficiently vetted ideas in high-stakes systems.

🔒 **The risks and lessons of overconfidence**: Drawing on first-hand experience, the author shows that even professionals can be overconfident in their own ideas, especially in unfamiliar fields. In cryptography, an internally designed pseudorandom number generator was easily broken, reinforcing the widely held consensus that one should not invent one's own cryptographic algorithms and underscoring the reliance on mature, battle-tested designs.

🤔 **Different challenges in AI ethics and cryptography**: Unlike cryptography, where a "clean attack" can demonstrate an algorithm's weakness, ideas in AI ethics are often hard to falsify directly. The author notes that AI safety lacks the kind of universally accepted "standard libraries" found in cryptography, making consensus-building and persuasion far harder, which deepens his pessimism about the field's prospects.

⚖️ **Beware overconfidence in "metaethics"**: The author extends the lesson from cryptography to metaethics, urging people to hold their philosophical views cautiously, especially when those views may inform high-stakes decisions or system designs. He suggests lowering one's absolute confidence in one's own views absent strong evidence or broad consensus, and staying open to others' perspectives and counterarguments.

Published on November 12, 2025 10:17 PM GMT

One day, when I was an intern at the cryptography research department of a large software company, my boss handed me an assignment to break a pseudorandom number generator passed to us for review. Someone in another department had invented it and planned to use it in their product, and wanted us to take a look first. This person must have had a lot of political clout or been especially confident in himself, because he refused the standard advice that anything an amateur comes up with is very likely to be insecure and that he should instead use one of the established, off-the-shelf cryptographic algorithms that have survived extensive cryptanalysis (code breaking) attempts.

My boss thought he had to demonstrate the insecurity of the PRNG by coming up with a practical attack (i.e., a way to predict its future output based only on its past output, without knowing the secret key/seed). There were three permanent, full-time professional cryptographers working in the research department, but none of them specialized in cryptanalysis of symmetric cryptography (which covers such PRNGs), so it might have taken them some time to figure out an attack. My time was obviously less valuable, and my boss probably thought I could benefit from the experience, so I got the assignment.

Up to that point I had no interest, knowledge, or experience with symmetric cryptanalysis either, but I was still able to quickly demonstrate a clean attack on the proposed PRNG, which succeeded in convincing the proposer to give up and use an established algorithm. Experiences like this are so common that everyone in cryptography quickly learns how easy it is to be overconfident about one's own ideas, and many viscerally know the feeling of their own brain betraying them with unjustified confidence. As a result, "don't roll your own crypto" is deeply ingrained in the culture and in people's minds.
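To make the idea of a "clean attack" concrete, here is a toy sketch (hypothetical; it is not the actual PRNG from the story, whose design isn't described). A linear congruential generator that emits its raw state looks random at a glance, yet an observer who knows only the public modulus can recover the secret parameters from three consecutive outputs and predict everything that follows, without ever learning the seed:

```python
# Toy illustration (hypothetical, NOT the reviewed PRNG): a linear
# congruential generator (LCG) whose raw state is used directly as output.
# Predicting its future output from past output is exactly the kind of
# attack described above.

M = 2**31 - 1  # public modulus (a Mersenne prime, assumed known to all)

def lcg(seed, a, c, n):
    """Generate n outputs; seed, a, and c play the role of the secret key."""
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % M
        out.append(x)
    return out

def predict_next(outputs):
    """Recover a and c from consecutive outputs, then predict the next one.

    From x2 = a*x1 + c and x3 = a*x2 + c:
        a = (x3 - x2) * (x2 - x1)^-1  (mod M)
        c = x2 - a*x1                 (mod M)
    """
    x1, x2, x3 = outputs[:3]
    a = ((x3 - x2) * pow(x2 - x1, -1, M)) % M  # inverse exists since M is prime
    c = (x2 - a * x1) % M
    return (a * outputs[-1] + c) % M

stream = lcg(seed=123456789, a=1103515245, c=12345, n=5)
predicted = predict_next(stream[:4])  # attacker sees only the first 4 outputs
assert predicted == stream[4]         # ...and predicts the 5th exactly
```

Real attacks on amateur designs are rarely this trivial, but the shape is the same: a short transcript of output plus some algebra, and the "randomness" evaporates.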

If only it were so easy to establish something like this in "applied philosophy" fields, like AI alignment! Alas, unlike in cryptography, it's rarely possible to come up with "clean attacks" that clearly show that a philosophical idea is wrong or broken. The most that can usually be hoped for is to demonstrate some kind of implication that is counterintuitive or contradicts other popular ideas. But because "one man's modus ponens is another man's modus tollens", if someone is sufficiently willing to bite bullets, it's impossible to directly convince them that they're wrong (or should be less confident) this way. This is made even harder because, unlike in cryptography, there are no universally accepted "standard libraries" of philosophy to fall back on. (My actual experiences attempting this, and almost always failing, are another reason why I'm so pessimistic about AI x-safety, even compared to most other x-risk concerned people.)

So I think I have to try something more meta, like drawing the above parallel with how easy it is to be overconfident in other fields, such as cryptography. Another meta line of argument is to consider how many people have strongly held, but mutually incompatible, philosophical positions. Behind a veil of ignorance, wouldn't you want everyone to be less confident in their own ideas? Or think "This isn't likely to be a subjective question like morality/values might be, so what are the chances that I'm right and they're all wrong? If I'm truly right, why can't I convince most others of this? Is there a reason or evidence that I'm much more rational or philosophically competent than they are?"

Unfortunately, I'm pretty unsure any of these meta arguments will work either. If they do change anyone's mind, please let me know in the comments or privately. Or if anyone has better ideas for how to spread a meme of "don't roll your own metaethics"[1], please contribute. And of course counterarguments are welcome too, e.g., if people rolling their own metaethics is actually good in a way that I'm overlooking.

[1] To preempt a possible misunderstanding, I don't mean "don't try to think up new metaethical ideas", but instead "don't be so confident in your ideas that you'd be willing to deploy them in a highly consequential way, or build highly consequential systems that depend on them in a crucial way". Similarly, "don't roll your own crypto" doesn't mean never try to invent new cryptography, but rather don't deploy it unless there has been extensive review and consensus that it is likely to be secure.
