A distillation of Ajeya Cotra and Arvind Narayanan on the speed of AI progress

This article focuses on the discussion around the speed of AI progress and its impact on society, centering on the views of Arvind Narayanan. He believes AI's impact will be gradual and continuous, unfolding over decades rather than a few years, and that most practical applications depend on real-world data and testing. This gradualness gives society time to adjust and build defenses. Arvind is cautious about pausing AI development or restricting open models, viewing the costs as outweighing the benefits. The article also covers capability transfer, cost-effectiveness, safety risks, and AI's role in running companies, and lays out experimental scenarios that could change Arvind's mind, such as AI rapidly mastering new skills without large amounts of task-specific data.

⏳ **The gradual, continuous nature of AI's impact**: Arvind Narayanan's core view is that AI's impact on society will be gradual and continuous, with development and adoption unfolding over decades rather than changing drastically within a few years. He argues that most practical applications of AI, including complex tasks like planning a wedding, require vast amounts of real-world data and extensive field testing, which itself limits the pace of AI progress. Society will therefore have enough time to adapt, adjust, and develop corresponding safety measures and interventions, reducing potential risks.

🔬 **Verifying key capabilities and revising views**: The article explores several scenarios that could shake Arvind's position. For example, if AI achieved significant transfer learning from games or simulated environments to open-ended tasks, or could build highly realistic simulations, that would challenge the "speed limit" thesis. Likewise, if an AI technology reached production readiness quickly without relying on large amounts of domain-specific data or real-world testing, such as a reliable and affordable general-purpose personal assistant, that would significantly change his assessment of the pace of AI progress.

🛡️ **Safety and model proliferation strategy**: Arvind argues that AI safety should be built on continuous development and diffusion, rather than on the assumption that model escape or self-replication is difficult. He stresses that overly fragile safety measures can instead lead to catastrophic failure. By allowing weaker models to proliferate widely, society can gradually build an "immune system" for dealing with stronger models. He opposes casually pausing AI development or restricting open models unless there is solid evidence that such measures would effectively reduce risk and their costs and benefits have been carefully weighed.

🏢 **AI's role and impact in running companies**: On whether AI could take on management roles such as CEO, Arvind argues this is not simply a question of technical capability but one of relative position and market advantage. He predicts AI will play an increasingly important role in company operations, possibly even becoming central to decision-making, but as a long, gradual process rather than a sudden disruption. He considers worrying now about companies run 100% by AI to be premature; the key is to understand how AI will gradually change what it means to "run a company" and bring profound economic transformation to society.

Published on July 22, 2025 2:59 PM GMT

Introduction

To help improve my own world models around AI, I am trying to understand and distill different worldviews. One worldview I am trying to understand is ‘AI as a normal technology’, by Arvind Narayanan and Sayash Kapoor. As a stepping stone to distilling that 15,000 word beast, I am first distilling a follow-up discussion between Ajeya Cotra and Arvind Narayanan, on the particular question about how quickly AI will progress and diffuse through society.

I found it surprisingly difficult to compress the key points of the discussion, so I have structured this post as follows: a summary of Arvind's key beliefs, extended quotes on what would change Arvind's mind, a miscellaneous section of arguments that resist compression, and my concluding thoughts.

A central reason I have struggled to distill things is that (I think) Arvind follows contextualizing norms, whereas I am trying to compress things into decoupled statements (in the sense of Decoupling vs Contextualizing Norms). For this reason, I quote significant portions of what is said, because the precise phrasing matters to accurately represent Arvind's views.

Summary of Arvind’s key beliefs

This is my attempt at isolating Arvind’s key beliefs from the discussion. As stated in the introduction, I recommend reading/skimming the rest of the article for more nuance and context.

(I actually believe the biggest crux is that Arvind thinks super-intelligence is not a meaningful concept and there is not significant room above human intelligence in most real-world tasks, e.g., forecasting or persuasion. However, this is not relevant for the discussion between Arvind and Ajeya. I plan to describe this more in a distillation of AI as a normal technology.)

What would change Arvind’s mind?

A large fraction of the discussion is Ajeya brainstorming observations or experiments that would separate their worldviews. They basically boil down to things developing quickly, without needing a lot of domain-specific data or real-world testing.

Generalizing from games or simulations to open-ended tasks

Ajeya: The current assumption is that, to be production-ready for applications like travel booking, you need significant training on that specific application. But what if continued AI capability development leads to better transfer learning from domains where data is easily collected to domains where it’s harder to collect — for instance, if meta-learning becomes highly effective with most training done on games, code, and other simulated domains?

Arvind: If we can achieve significant generalization from games to more open-ended tasks, or if we can build truly convincing simulations, that would be very much a point against the speed limit thesis.

A reliable agent, but with high inference costs

Ajeya: I'm curious if your view would change if a small engineering team could create an agent with the reliability needed for something like shopping or planning a wedding, but it's not commercially viable because it's expensive and takes too long on individual actions, needing to triple-check everything.

Arvind: That would be super convincing. I don't think cost barriers will remain significant for long.

An RCT that showed AI can guide high schoolers to make smallpox

Ajeya: Let's say we did an RCT and discovered that random high school students could be talked through exactly how to make smallpox, with AI helping order DNA fragments and bypass KYC monitoring. If that happened in 2025, would you support pausing AI development until we could harden those systems and verify the AI now fails at this task?

[...]

Arvind: this would have to be in a world where open models aren't competitive with the frontier, right? Because otherwise it wouldn’t matter. But yes, if those preconditions hold — if we think pausing would actually affect attackers' access to these AI capabilities, and if the RCT evidence is sufficiently compelling — then I could see some version of a pause being warranted.

Broad adoption of AI within AI companies

Ajeya: Let's say we magically had deep transparency into AI companies and how they're using their systems internally, and we start seeing AI systems rapidly being given deference in really broad domains, reaching team lead level, handling procurement decisions, moving around significant money. Would that change your view on how suddenly the impacts might hit the rest of the world?

Arvind: That would be really strong evidence that would substantially change my views on a lot of what we’ve talked about.

Companies prioritizing accelerating AI R&D over end-user products

Ajeya: What I worry about is a world where companies are directing AI development primarily toward accelerating AI R&D and hardware R&D. They try to make enough money to keep going, but won't bother creating a great personal assistant AI agent because it’s hard to do right now but would become much easier after this explosive capabilities progress is complete.

Arvind: That's a fair concern, though I personally think it's unlikely because I believe much of the learning has to happen through deployment. But I very much understand the concern, and I'd support transparency interventions that would let us know if this is happening.

 

Developing real-world capabilities mostly in a lab

Ajeya: Do you have particular experiments that would be informative about whether transfer can go pretty far, or whether you can avoid extensive real-world learning?

Arvind: The most convincing set of experiments would involve developing any real-world capability purely (or mostly) in a lab — whether self-driving or wedding planning or drafting an effective legal complaint by talking to the client.

 

Creation of effective general-purpose assistants in 2025 or 2026

Ajeya: if in 2025 or 2026 there was a fairly general-purpose personal assistant that worked out of the box — could send emails, book flights, and worked well enough that you'd want to use it — would that shift your thinking about how quickly this technology will impact the real world?

Arvind: That would definitely be a big shift. [...] we've learned about the capability-reliability gap, prompt injection, context issues, and cost barriers. If all those half-dozen barriers could be overcome in a one-to-two year period, even for one application, I'd want to deeply understand how they changed so quickly. It would significantly change my evaluation.

 

Learning a new language with a minimal phrasebook

Ajeya: I've been looking for good benchmarks or ecological measurements of meta-learning and sample-efficient learning — basically any reproducible measurement. But I've come up short because it's quite hard to confirm the model doesn't already know something and is actually learning it. Do you have any suggestions?

Arvind: I've seen some attempts in different domains. For instance, testing if a model can learn a new language given just a phrasebook, knowing the language wasn't in the training data. That would be pretty strong evidence if the model could learn it as well as if it had extensive training data.

 

Miscellaneous

On several occasions, Arvind presents some interesting arguments, but they often involve multiple interrelated sub-claims that cannot be neatly disentangled. I have copied big chunks of the discussion, with sub-headers indicating the key discussion points.

Assume models can escape/self-reproduce, safety via continuous development and diffusion 

Arvind: I think we should assume every model will be capable of escape and self-reproduction. Safety shouldn't rely on that being difficult.

Ajeya: Do you think current models are capable of that, or is this just a conservative assumption we should be making?

Arvind: It's partly a conservative assumption, but it also relates to resilience versus fragility. I think many proposed safety interventions actually increase fragility. They try to make sure the world doesn’t get into some dangerous state, but they do it in such a way that if the measure ever fails, it will happen discontinuously rather than continuously, meaning we won't have built up an “immune system” against smaller versions of the problem. If you have weak models proliferating, you can develop defenses that scale gradually as models get stronger. But if the first time we face proliferation is with a super-strong model, that's a much tougher situation.

Ajeya: I think I see two implicit assumptions I'd want to examine here.

First, on the object level, you seem to believe that the defender-attacker balance will work out in favor of defense, at least if we iteratively build up defenses over time as we encounter stronger and stronger versions of the problem (using increasingly stronger AIs for better and better defense). One important reason I'm unsure about this assumption is that if AI systems are systematically misaligned and collectively extremely powerful, they may coordinate with one another to undermine human control, so we may not be able to straightforwardly rely on some AIs keeping other AIs in check.

Then, on the meta level, it also seems like you believe that if you’re wrong about this, there will be some clear warning sign before it’s too late. Is that right? 

Arvind: Yes, I have those assumptions. And if we don’t have an early warning sign, the most likely reason is that we weren’t doing enough of the right kinds of measurement.

Can AI’s be CEOs, premature to think about 100% AI-run companies, slow vs fast takeoff

(The opening paragraph below is a good example of how Arvind follows contextualizing norms that make it hard for me to understand what their position even is. Either they are saying something obviously false, or there is some difference in our background assumptions.)

Arvind: Many of these capabilities that get discussed — I'm not even convinced they're theoretically possible. Running a successful company is a classic example: the whole thing is about having an edge over others trying to run a company. If one copy of an AI is good at it, how can it have any advantage over everyone else trying to do the same thing? I'm unclear what we even mean by the capability to run a company successfully — it's not just about technical capability, it's about relative position in the world.

[…]

Ajeya: [Human CEOs] might not be competitive unless they defer high-level strategy to AI, such that humans are CEOs on paper but must let their AI make all decisions because every other company is doing the same. Is that a world you see us heading toward? I think I've seen you express skepticism earlier about reaching that level of deference.

Arvind: I think we're definitely headed for that world. I'm just not sure it's a safety risk. [...] For the foreseeable future, what it means to “run a company” will keep changing rapidly, just as it has with the internet. I don't see a discontinuity where AI suddenly becomes superhuman at running companies and brings unpredictable, cataclysmic impacts. As we offload more to AI, we'll see economically transformative effects and enter a drastically different world. To be clear, I think this will happen gradually over decades rather than a singular point in time. At that stage, we can think differently about AI safety. It feels premature to think about what happens when companies are completely AI-run.

Ajeya: I don't see it as premature because I think there's a good chance the transition to this world happens in a short few years without enough time for a robust policy response, and — because it’s happening within AI companies — people in the outside world may feel the change more suddenly. 

Arvind: Yup, this seems like a key point of disagreement! Slow takeoff is core to my thinking, as is the gap between capability and adoption —  no matter what happens inside AI companies, I predict that the impact on the rest of the world will be gradual.

Biorisk, intervening in the whole bio supply chain, costs and benefits of restricting open models

Ajeya: Okay, let’s consider the example of biorisk. Let's say we did an RCT and discovered that random high school students could be talked through exactly how to make smallpox, with AI helping order DNA fragments and bypass KYC monitoring. If that happened in 2025, would you support pausing AI development until we could harden those systems and verify the AI now fails at this task?

Arvind: Well, my hope would be that we don't jump from our current state of complacency directly to that point. We should have testing in place to measure how close we're getting, so we can respond more gradually. While this is a low-confidence statement, I think the preferred policy response would focus on controlling the other bottlenecks that are more easily manageable — things like screening materials needed for synthesis and improving authentication/KYC — rather than pausing AI development, which seems like one of the least effective ways to mitigate this risk.

Ajeya: But let's say we're unlucky — the first time we do the RCT, we discover AIs are more powerful than we thought and can already help high schoolers make smallpox. Even if our ultimate solution is securing those supply chain holes you mentioned, what should we do about AI development in the meantime? Just continue as normal?

Arvind: Well, this would have to be in a world where open models aren't competitive with the frontier, right? Because otherwise it wouldn’t matter. But yes, if those preconditions hold — if we think pausing would actually affect attackers' access to these AI capabilities, and if the RCT evidence is sufficiently compelling — then I could see some version of a pause being warranted.

Ajeya:  So maybe in 2027 we will do this RCT, and we will get this result, and we will want to be able to stop models from proliferating. And then we might think — I wish that in 2025, we had done things to restrict open source models beyond a certain capability level. This particular question is very confusing to me because I think open source models have huge benefits in terms of people understanding where capability levels are in a way that AI companies can't gate or control and in letting us do a whole bunch of safety research on those models. But this is exactly the kind of thing I would like to hear you speak to — do you think it's valuable to give ourselves that lever? And how should we think about if or when to make that choice? 

Arvind: There's certainly some value in having that lever, but one key question is: what's the cost? On utilitarian grounds alone, I’m not sure it's justified to restrict open models now because of future risks. To justify that kind of preemptive action, we'd need much more evidence gathering. Do we know that the kind of purely cognitive assistance that models can provide is the bottleneck to the threats we’re worried about? And how do other defenses compare to restricting open models in terms of cost and effectiveness? But more saliently, I don't think a cost-benefit approach gives us the full picture. The asymmetry between freedom-reducing interventions and other interventions like funding more research is enormous. Governments would rapidly lose legitimacy if they attempt what many view as heavy-handed interventions to minimize speculative future risks with unquantified probabilities.

Continuous vs discontinuous development, how it’s partly our choice, how it impacts policy feasibility, frontier lab transparency as a form of continuity

Ajeya: I think the biggest difference between our worldviews is how quickly and with how little warning we think these risks might emerge. I want to ask — why do you think the progression will be continuous enough that we will get plenty of warning?

Arvind: Partly I do think the progression will be continuous by default. But partly I think that's a result of the choices we make — if we structure our research properly, we can make it continuous. And third, if we abandon the continuity hypothesis, I think we're in a very bad place regarding policy. We end up with an argument that's structurally similar — I'm not saying substantively similar — to saying “aliens might land here tomorrow without warning, so we should take costly preparatory measures.”

If those calling for intervention can't propose some continuous measure we can observe, something tied to the real world rather than abstract notions of capability, I feel that's making a policy argument that's a bridge too far. I need to think more about how to make this more concrete, but that's where my intuition is right now.

Ajeya: Here's one proposal for a concrete measurement — we probably wouldn't actually get this, but let's say we magically had deep transparency into AI companies and how they're using their systems internally. We're observing their internal uplift RCTs on productivity improvements for research engineers, sales reps, everyone. We're seeing logs and surveys about how AI systems are being used. And we start seeing AI systems rapidly being given deference in really broad domains, reaching team lead level, handling procurement decisions, moving around significant money. If we had that crystal ball into the AI companies and saw this level of adoption, would that change your view on how suddenly the impacts might hit the rest of the world?

Arvind: That would be really strong evidence that would substantially change my views on a lot of what we’ve talked about. But I'd say what you're proposing is actually a way to achieve continuity, and I strongly support it. This intervention, while it does reduce company freedom, is much weaker than stopping open source model proliferation. If we can't achieve this lighter intervention, why are we advocating for the stronger one?

 

Frontier lab strategy, one brilliant insight vs training on billions of data points per task

Ajeya: I think the eventual smoothing [of jagged capabilities] might not be gradual — it might happen all at once because large AI companies see that as the grand prize. They're driving toward an AI system that's truly general and flexible, able to make novel scientific discoveries and invent new technologies — things you couldn't possibly train it on because humanity hasn't produced the data. I think that focus on the grand prize explains their relative lack of effort on products — they're putting in just enough to keep investors excited for the next round. It's not developing something from nothing in a bunker, but it's also not just incrementally improving products. They're doing minimum viable products while pursuing AGI and artificial superintelligence.

It's primarily about company motivation, but I can also see potential technical paths — and I'm sure they're exploring many more than I can see. It might involve building these currently unreliable agents, adding robust error checking, training them to notice and correct their own errors, and then using RL across as many domains as possible. They're hoping that lower-hanging fruit domains with lots of RL training will transfer well to harder domains — maybe 10 million reps on various video games means you only need 10,000 data points of long-horizon real-world data to be a lawyer or ML engineer instead of 10 million. That's what they seem to be attempting, and it seems like they could succeed.

Arvind: That's interesting, thank you.

Ajeya: What's your read on the companies' strategies?

Arvind: I agree with you — I've seen some executives at these companies explicitly state that strategy. I just have a different take on what constitutes their “minimum” effort — I think they've been forced, perhaps reluctantly, to put much more effort into product development than they'd hoped.

Ajeya: Yeah, back in 2015 when OpenAI was incorporated, they probably thought it might work more like inventing the nuclear bomb — one big insight they could develop and scale up. We're definitely not there. There's a spectrum from “invent one brilliant algorithm in a basement somewhere” all the way to “gather billions of data points for each specific job.” I want us to be way on the data-heavy end — I think that would be much better for safety and resilience because the key harms would first emerge in smaller forms, and we would have many chances to iterate against them (especially with tools powered by previous-generation AIs). We're not all the way there, but right now, it seems plausible we could end up pretty close to the “one-shot” end of the spectrum.

Concluding thoughts

My overall position is that a fast takeoff is plausible, and that is reason enough to react. A major reason is that we have previously seen big jumps in capability, e.g., from GPT-2 to GPT-3, and Arvind does not present any compelling reasons why further scale, unhobbling, or minor breakthroughs won't cause more big jumps in capability. For example, somebody could feasibly discover some memory scaffolding that allows AI systems to have robust and useful long-term memory.

Second, I am glad I did this in-depth reading of the article. I usually skim-read, and my original takeaway from skim-reading this conversation was that there was no obvious disagreement between Arvind and Ajeya: they seem to agree on which trends are worth tracking and which experiments would be worth doing. Now, after writing this distillation, I have some (rudimentary) model of Ajeya's and Arvind's beliefs, which I can call upon when thinking about future trends or evaluating new research.


