Legal Challenges and Feasibility of an AGI Ban

 

The article examines the legal and definitional problems facing current proposals to ban artificial general intelligence (AGI). The author notes that many proposals define AGI as AI that could lead to human extinction, and that this outcome-based definition is legally ineffective, because the law has to intervene before the harm occurs. The article stresses the importance of strict liability and clear thresholds, drawing an analogy to the success of nuclear weapons treaties, which achieve their regulatory purpose by banning specific "precursor" capabilities rather than vaguely defined "extinction-risk weapons". It calls for legal experts to be brought into the discussion of AGI regulation so that a practical, enforceable legal framework can be drafted, instead of relying on easily circumvented proxy metrics such as compute caps. The ultimate goal is a ban on AGI precursor capabilities that is as precise and enforceable as the nuclear treaties' controls on fissile material.

⚖️ Outcome-based definitions of an AGI ban are legally flawed: defining AGI as AI that "could lead to human extinction" has little legal force, because regulation has to be triggered before the potential harm occurs, and once extinction has happened there is no one left to enforce the law. The law must therefore target the specific precursor capabilities that make the harm possible, not the final outcome.

💡 Strict liability and clear thresholds are key: drawing on the success of nuclear weapons treaties, the article argues that an AGI ban should set specific, measurable capability thresholds, such as autonomous replication, large-scale resource acquisition, and systematic deception, rather than relying on vague definitions. This mirrors how nuclear treaties ban zero-yield tests, specific quantities of fissile material, and particular delivery systems, which is what makes them enforceable.

🏛️ The involvement of legal experts is essential: the definitional work for AGI regulation should not be left to policymakers alone; legal experts must be brought in. Otherwise, policymakers will reach for easily measured proxies (such as compute caps), while in-house counsel at AI companies will find ways around the ban. Involving legal experts helps ensure the ban is precise and enforceable rather than a formality.

Published on September 20, 2025 11:01 AM GMT

TL;DR

Most “AGI ban” proposals define AGI by outcome: whatever potentially leads to human extinction. That’s legally insufficient: regulation has to act before harm occurs, not after.

Bottom line: If we want a credible AGI ban, we must define and prohibit precursor capabilities with the same precision and enforceability that nuclear treaties applied to fissile material. Anything less collapses into Goodharting and fines-as-business-cost.

Why outcome-based AGI ban proposals don’t work

I keep seeing proposals framed as “international bans on AGI,” where AGI is defined extremely broadly, often something like “whatever AI companies develop that could lead to human extinction.” As a lawyer, I can’t overstate how badly this type of definition fails to accomplish its purpose. To enable a successful ban on AGI, regulation has to operate ex ante: before the harm materialises. 

If the prohibited category is defined only by reference to the ultimate outcome (in this case, human extinction), then by construction the rule cannot trigger until after the damage is done. At that point the law is meaningless; extinction leaves no survivors to enforce it.

That’s why the definitional work must focus not on the outcome itself, but on the specific features that make the outcome possible: the capabilities, thresholds, or risk factors that causally lead to the catastrophic result. Other high-stakes tech domains already use this model. Under the European General Data Protection Regulation, companies can be fined simply for failing to implement adequate security measures, regardless of whether an intentional breach has occurred. Under European product liability law, a manufacturer is liable for a defective product even if they exercised all possible care to prevent such a defect. And even under U.S. export-control law, supplying restricted software without a licence is an offence regardless of intent.

The same logic applies here: to ban AGI responsibly, we need to ban the precursors that make extinction-plausible systems possible, not "the possibility of extinction" itself.

The luxury of "defining the thing" ex post

TsviTB, in What could a policy banning AGI look like?, poses a fair question: “Is it actually a problem to have fuzzy definitions for AGI, when the legal system uses fuzzy definitions all the time?”

In ordinary law, the fuzziness is tolerable because society can absorb the error. If a court gets it wrong on whether a death was murder or manslaughter, the consequences are tragic but not civilisation-ending. And crucially, many of these offences carry criminal penalties (actual prison time) which creates a strong incentive not to dance around the line.

But an AGI ban is likely to sit under product safety law, at least initially (more on this next), where penalties are usually monetary fines. That leaves the door wide open for companies to “Goodhart” their way into development: ticking compliance boxes while still building systems that edge toward the prohibited zone.

This is a painfully common dynamic in corporate law. For example, multinationals routinely practice tax avoidance right up to the legal line of tax evasion. They can prove with audited books that what they’re doing is legal, even if it clearly undermines the spirit of the law. They still end up paying far less tax than they would under a normal corporate structure, by using arrangements that SMEs can’t afford to set up.

In practice, they achieve the very outcome the law was designed to prevent, but they do it legally. They don’t get away with all of it, but they get away with maybe 80%.

We don't have this luxury: we cannot afford an AGI ban that is “80% avoided.” Whether the framework sits under civil or criminal law, it will only work if it sets a robust, precise threshold and attaches penalties strong enough to change incentives, not just fines companies can write off as a cost of doing business.

Actually defining the thing we want to ban

If an agreement like this is to work, the first item on the agenda must be to define what counts as the thing you want to ban.

Why do I think an AGI ban will default to a product safety framework rather than a criminal law framework? Because that’s the path the EU AI Act has already taken. It sets the first precedent for “banning” AI: certain systems are prohibited from being put on the market or deployed when they pose irreversible societal harms (e.g. manipulative systems, exploitative targeting, biometric categorisation, social scoring).

For readers less familiar with the EU AI Act, here’s the exact list of practices it bans under Article 5:

Summary of Prohibited AI Practices under Art. 5 EU AI Act

The EU AI Act prohibits placing on the market, putting into service, or using AI systems that:

    Manipulate behaviour: use subliminal, manipulative, or deceptive techniques that significantly impair decision-making and cause harm.
    Exploit vulnerabilities: target people due to age, disability, or socio-economic status, distorting behaviour and causing harm.
    Social scoring: evaluate/classify people based on behaviour or traits, leading to unjustified or disproportionate detrimental treatment.
    Predict criminality: assess the risk of someone committing crimes based solely on profiling or personality traits.
    Facial scraping: create or expand biometric databases by untargeted scraping from the internet or CCTV.
    Emotion recognition: infer emotions in workplaces or schools (except for medical/safety use).
    Biometric categorisation: classify individuals by sensitive attributes (race, religion, sexual orientation, etc.), with limited law enforcement exceptions.
    Real-time biometric identification (RBI): use RBI in public spaces for law enforcement (with exceptions).

 

But notice the structure: these prohibitions sit inside what is essentially a product-safety framework, enforced through administrative fines rather than criminal penalties.

This is exactly the failure mode I worry about. If an “AGI ban” is drafted the same way, it will look tough on paper, but in practice it will be little more than a product-safety regulation: companies treating fines as a cost of doing business, and governments tolerating it for the sake of “innovation”.

That’s why the definitional work matters so much. If the ban is going to be enforceable, we can’t define AGI in terms of its final outcome (extinction) or leave it to vague product-safety language. 

If the AI Safety field fails to define the thing we want to ban, the task will be left up to policymakers, who will reach for the “most measurable” available proxy (likely compute thresholds), and the entire ecosystem will Goodhart against that.

We need a crisp proxy for what makes a system cross the danger line: capability evaluations, autonomy thresholds, demonstrable ability to replicate, deceive, or seize resources. Something specific enough that strict liability can attach before catastrophe, not after. 
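To make the idea concrete, here is a minimal sketch of how such precursor thresholds could be encoded as machine-checkable bright lines. Everything in it is hypothetical: the capability names, metrics, and limit values are invented for illustration, not proposed regulatory numbers.

```python
# Illustrative sketch only: hypothetical precursor-capability bright lines
# encoded as machine-checkable criteria. All names and values are invented.
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecursorThreshold:
    capability: str  # e.g. "autonomous_replication"
    metric: str      # what a pre-deployment evaluation measures
    limit: float     # bright-line value above which the prohibition applies


# Hypothetical bright lines, analogous in spirit to "8 kg of plutonium".
THRESHOLDS = [
    PrecursorThreshold("autonomous_replication", "successful_self_replications", 0),
    PrecursorThreshold("resource_acquisition", "usd_acquired_without_authorisation", 0),
    PrecursorThreshold("systematic_deception", "evaluator_deception_rate", 0.01),
]


def crossed_bright_lines(eval_results: dict[str, float]) -> list[str]:
    """Return every precursor capability whose measured value exceeds its
    bright line. Under a strict-liability rule, any non-empty result would
    trigger the prohibition ex ante, before deployment."""
    return [
        t.capability
        for t in THRESHOLDS
        if eval_results.get(t.metric, 0.0) > t.limit
    ]


# Example: results from a (hypothetical) pre-deployment capability evaluation.
report = {
    "successful_self_replications": 2,
    "usd_acquired_without_authorisation": 0,
    "evaluator_deception_rate": 0.003,
}
print(crossed_bright_lines(report))  # -> ['autonomous_replication']
```

The point of the sketch is not the numbers but the shape of the rule: liability attaches when a measured capability crosses a pre-agreed line, not when someone judges that extinction might follow.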

Credible bans depend on bright lines

The political reality is that states are unlikely to agree to halt all frontier development. This is clear to anyone who has read the U.S. AI Action Plan. Even Europe, often seen as the strictest regulator, is taking an ambitious “pro-innovation” stance.

If a proposal for an AGI ban is to succeed, it has to be precise enough to block “AGI coup”-class systems while still permitting beneficial progress. Otherwise, it will be dismissed as too restrictive or difficult to actually enforce without expensive innovation caps.

Learning from nuclear treaties

Nuclear treaties don’t ban “whatever weapons might end humanity.” They ban specific precursor states and activities with crisp, enforceable thresholds: zero-yield tests, “significant quantities” of fissile material (8 kg of plutonium, 25 kg of HEU), and delivery systems above 500 kg/300 km. These bright lines allow intrusive verification and enforcement, while still permitting conventional weapons to exist[1]. The principle is clear: regulate by measurable inputs and capabilities, not by catastrophic outcomes.
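To see why such bright lines are enforceable in a way outcome-based definitions are not, here is a minimal sketch using the delivery-system threshold cited above (roughly, payload above 500 kg and range above 300 km; the exact treaty wording differs, and the simplification is mine). The rule reduces to a comparison of measurable parameters, with no reference to the catastrophe it is meant to prevent.

```python
# Minimal sketch: a bright-line rule is just a check over measurable inputs.
# The 500 kg / 300 km figures follow the simplified reading in the text above.
def exceeds_delivery_threshold(payload_kg: float, range_km: float) -> bool:
    """True if a delivery system crosses the 500 kg payload / 300 km range line."""
    return payload_kg >= 500 and range_km >= 300


print(exceeds_delivery_threshold(600, 350))  # True: above both bright lines
print(exceeds_delivery_threshold(400, 250))  # False: below both
```

An outcome-based test ("could this weapon end humanity?") has no comparably checkable form.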

Until similar definitional work is done for AGI, talk of an “AGI ban” is rhetoric. Useful rhetoric, yes: it shifts the Overton window, it mobilises groups like PauseAI and ControlAI, and it keeps extinction-level risk on the policy agenda. But as law, it will fail unless we solve the definitional problem and bring legal experts into the room (not just policymakers[2]). Otherwise, in-house counsel at AI labs will simply map the gray areas and find compliant-looking ways to bypass the ban.

If we want a credible ban, we need to put the same intellectual effort into defining precursor thresholds for AI that we once put into fissile material, missile ranges, and test yields. Anything less will collapse into Goodharting and "product safety rules" with fines as the only compliance incentive. 

 

  1. ^

    I do not endorse the manufacturing of any weapons; I am merely using this as an example for illustrative purposes.

  2. ^

    One blind spot I notice is how rarely tech lawyers are brought into AI safety strategy. Lawyers in large firms or in-house roles often sit at the chokepoints of real leverage: they can delay deals, rewrite contractual terms, and demand transparency in ways that policymakers often cannot. In the EU especially, an AGI ban (or any ambitious legislative action) would ultimately be implemented, interpreted, and either undermined or strengthened by these lawyers. If they are left out of the conversation, the path of least resistance will be to map the gray areas and advise clients on how to bypass the spirit of the law. 



