Examining the Intrinsic Value of Artificial General Intelligence (AGI)

This post takes a close look at current arguments for artificial general intelligence (AGI) and questions their core motivation. The author contends that many arguments for AGI ultimately reduce to arguments for specific capabilities rather than for generality itself. Using a "subtraction test" and an analogy with Plato's argument about justice, the post tries to strip away instrumental benefits and ask whether generality has independent value. The author analyses the cognitive-economy argument, the scaling hypothesis, the meta-solver argument, and the argument from unknown unknowns, and distinguishes breadth of capabilities from genuine generality. The post closes by stressing the importance of clarifying what the goal of AGI actually is, and recommends that researchers pursuing AGI be honest about what is driving them and prioritize capability measurement, alignment research, and the distinction between breadth and generality.

💡 **The "subtraction test" surfaces the core question:** The current pursuit of AGI, framed in terms of mathematical reasoning, long-horizon tasks, and so on, seems to decompose into a list of specific capabilities. If we could achieve all of those capabilities without "generality", would we lose anything of value? This forces us to ask whether generality is just a collective name for a set of capabilities or an independent, qualitatively different property.

💰 **Limits of the cognitive-economy argument:** A general system might be more efficient because of shared representations, but this hides an assumption: that maintaining generality costs less than the sum of maintaining many narrow systems. Current evidence points the other way: foundation models are expensive to train and operate, while specialized models are more efficient. Shared representations are also not always beneficial; they can lead to catastrophic forgetting or negative transfer.

🚀 **Separating technical from socioeconomic inevitability:** Many AGI arguments conflate the two. Technical inevitability (e.g. the scaling hypothesis) holds that generality will emerge automatically as compute grows, but what emerges may only be functional breadth rather than generality that can generalize without bound. Socioeconomic inevitability concerns competitive pressure and coordination problems, which call for governance mechanisms.

⚖️ **Risks hidden in the "meta-solver" argument:** The claim that an AGI will solve every other problem may overlook the no-free-lunch theorem, i.e. that no single model can be optimal across all domains. More importantly, it conflates capability with alignment: a powerful general intelligence that does not share human values could be more dangerous than several aligned narrow systems.

🧐 **Breadth versus generality:** We may be conflating "breadth of capabilities" (having many capabilities) with "generality" (being able to generalize across domains and chain inferences). The real value of generality lies in whether it can leverage known domains to solve new, even unknown, ones. If it cannot, sufficient breadth may be enough, and generality per se is not what we want.

Published on October 13, 2025 9:49 PM GMT

Thesis Statement[1]

Current arguments for AGI can be distilled to arguments for specific capabilities, not for generality in itself. We need to examine whether there exists a genuine and sound argument for generality as an independent property.

Introduction

In Plato's Republic, Glaucon challenges Socrates to show why justice is good in and of itself, rather than arguing for its instrumental benefits. In other words, Socrates has to show Glaucon that we value justice itself, not merely its after-effects:

"For I want to hear what justice and injustice are, and what power each has when it is just by itself in the soul. I want to leave out of account the rewards and the consequences of each of them." (Plato, Republic, 358b-c)

Following Glaucon's spirit, I dare ask: is generality in AI valuable in itself, or do we pursue it merely for its expected instrumental effects?

Dialectic

The problem of reduction

When leading labs say "we're building towards AGI," what do they really mean? If we enumerate all the capabilities they desire (mathematical reasoning, long-horizon tasks, automated R&D and real-world economic tasks, ...) does anything remain in the term AGI after we subtract this list? Or is AGI simply a short name for "all of these capabilities together"?

Most, if not all, pro-generality arguments seem to be reducible to arguments for specific capabilities or for their instrumental benefits.

It seems fair, then, to ask whether generality is just the name we give to a sufficiently large conjunction of specific capabilities, or whether there is something qualitatively distinct: generality itself.

The subtraction test: If we could have all the specific capabilities that AGI promises, but without 'generality' (whatever that means; say, all of those capabilities housed in separate, narrow models), would we lose any value?

The missing argument: intrinsic value

No one seems to argue that generality has value in itself (the way we might argue about consciousness or wellbeing). Why not? Maybe because AI is, seemingly, instrumental by nature. So why do we want generality? And is that really what we want?

The argument from cognitive economy

A general system may be more efficient than maintaining a comprehensive set of narrow systems because shared representations allow development, inference, and maintenance costs to be amortized across domains.

But there is an implicit assumption here: that the cost of maintaining generality will be lower than the summed costs of the narrow systems (development, inference, and maintenance). Is this empirically true? Could we build accurate mathematical cost models?
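To make the question concrete, here is a minimal, purely illustrative cost-model sketch; every number in it is a hypothetical placeholder rather than an estimate, and the point is the shape of the comparison, not the figures.

```python
# Minimal, illustrative cost model: all numbers are hypothetical placeholders.

def total_cost(dev, inference_per_task, maintenance_per_year, tasks, years):
    """Total cost of ownership for one system over a time horizon."""
    return dev + inference_per_task * tasks + maintenance_per_year * years

# One general model amortizes a single development effort across all domains...
general = total_cost(dev=500e6, inference_per_task=0.02,
                     maintenance_per_year=50e6, tasks=1e9, years=5)

# ...versus 20 narrow models, each cheaper to build and run per task,
# together covering the same one billion tasks.
narrow = sum(
    total_cost(dev=5e6, inference_per_task=0.002,
               maintenance_per_year=1e6, tasks=5e7, years=5)
    for _ in range(20)
)

print(f"general: ${general:,.0f}   sum of narrow: ${narrow:,.0f}")
# The cognitive-economy argument amounts to the claim general < narrow;
# with these placeholder parameters, it does not hold.
```

Whether the inequality actually holds is exactly the empirical question above.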

Currently, foundation models are very expensive to train and operate, and pushing the frontier is not getting any cheaper. Meanwhile, specialized models are much more efficient. So far, it seems that, if we think in terms of cost/benefit, empirical evidence may favor specialized models.

This argument also seems to assume that shared representations are necessarily beneficial. Yet in ML it is well known that there are many trade-offs: a model aimed at doing everything may suffer from catastrophic forgetting or negative transfer.

The scaling hypothesis and two types of inevitability

Arguments for AGI often conflate two distinct claims about inevitability:

1. Technical inevitability: generality will emerge, more or less automatically, as we keep scaling up compute and data.
2. Socioeconomic inevitability: competitive pressures and coordination problems make it unavoidable that someone builds it.

The distinction matters. Socioeconomic inevitability is a governance problem, which suggests we need coordination mechanisms. Technical inevitability, on the other hand, is a scientific claim, which suggests generality will emerge whether we coordinate or not.

Let's focus on the technical claim. If this view is correct, then asking "should we build generality?" becomes moot. Generality would be an inevitable byproduct of scaling up systems initially designed for narrow tasks (such as next-token prediction). We wouldn't necessarily be aiming for generality; rather, we'd simply observe its emergence.

But this argument smuggles in a few assumptions: that current scaling trends will continue far enough, and that what emerges from scale is genuine generality, able to generalize without bound, rather than merely a broad but bounded set of capabilities.
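As a sketch of how the technical claim is usually cashed out, here is the common power-law form from published scaling-law work; the constants below are illustrative, not fitted to anything.

```python
# Scaling-hypothesis sketch: loss as a smooth power law of training compute.
# The constants a and alpha are illustrative, not fitted values.

def predicted_loss(compute_flops, a=2.57, alpha=0.048):
    """L(C) = a * C**(-alpha): next-token loss as a function of compute."""
    return a * compute_flops ** (-alpha)

for c in (1e21, 1e23, 1e25):
    print(f"compute={c:.0e}  predicted_loss={predicted_loss(c):.3f}")

# Note what this does and does not say: the law predicts a smooth decrease in
# next-token loss, i.e. growing functional breadth on (roughly) the training
# distribution. Nothing in the formula itself guarantees the unbounded,
# chainable generalization this post calls "generality".
```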

The meta-solver argument

This argument states that it will be easier to build AGI and have it solve all other specific problems than to solve every problem independently. It tends to come with the easily repeated slogan "it'll be our last invention".

Some possible issues with this argument:

1. It may run into no-free-lunch considerations: no single model can be optimal across all domains, so a solver of everything need not be the best solver of anything (a toy illustration follows this list).
2. It conflates capability with alignment. A powerful general intelligence that is not aligned with human values could be more dangerous than several aligned narrow systems.
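On the first point, a toy construction (my own, for illustration only) captures the no-free-lunch intuition: averaged over every possible labeling of a finite domain, any fixed predictor performs identically off the training set, so no single solver can dominate on every problem.

```python
from itertools import product

# Toy no-free-lunch illustration on a four-point domain.
domain = [(0, 0), (0, 1), (1, 0), (1, 1)]
test = domain[2:]   # held-out points; labelings over the whole domain vary below

def off_training_accuracy(predict, truth):
    return sum(predict(x) == truth[x] for x in test) / len(test)

# Two very different fixed predictors.
predictors = {"always_0": lambda x: 0, "always_1": lambda x: 1}

for name, predict in predictors.items():
    scores = []
    for labels in product([0, 1], repeat=len(domain)):   # all 16 labelings
        truth = dict(zip(domain, labels))
        scores.append(off_training_accuracy(predict, truth))
    print(name, sum(scores) / len(scores))   # both average exactly 0.5
```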

The argument from unknown unknowns

One could argue that we cannot know in advance what issues we may need to solve, and that generality gives us that flexibility to respond to unknown unknowns.

Yet this again seems to be an instrumental argument for, say, flexibility or adaptability, not for generality in itself. Moreover, what warrants the assumption that generality equals adaptability?[4] The most adaptable biological systems we know (bacteria) are not the most general.

Breadth or generality?

Perhaps we conflate breadth of capabilities with generality. Consider two systems:

1. System A has a very large, fixed repertoire of capabilities spanning many domains: sheer breadth.
2. System B has fewer capabilities out of the box, but can generalize across domains and chain those generalizations into new ones.

Which is more valuable? The answer seems to hinge on whether System B can sustain chains of generalization, using domain X to solve slightly-OOD domain Y, then using that to tackle even-further-OOD domain Z. If yes, then generality represents something genuinely powerful. If not, then System A's breadth may be superior. This latter case would suggest we actually value sufficient breadth, not generality per se.[5]
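To illustrate the criterion in footnote 5, here is a toy sketch of what "chainable out-of-distribution transfer" could mean operationally; the domains, threshold, and the two model classes are all illustrative choices, not a proposed benchmark.

```python
import numpy as np

# Toy "chainable OOD transfer": each domain sits slightly beyond the previous
# one; a system chains if, after fitting what it has seen so far, its error on
# the next domain stays below a threshold, step after step.

def target(x):
    return 3.0 * x + 1.0                      # the rule shared by every domain

domains = [np.linspace(0, 1, 20), np.linspace(1, 2, 20), np.linspace(2, 4, 20)]

def chains(fit, threshold=0.1):
    seen = domains[0]
    for nxt in domains[1:]:
        predict = fit(seen, target(seen))
        if np.mean(np.abs(predict(nxt) - target(nxt))) > threshold:
            return False                      # breadth, but the chain breaks
        seen = np.concatenate([seen, nxt])    # fold the new domain in
    return True

# A linear extrapolator recovers the shared rule, so each step succeeds...
linear_fit = lambda x, y: np.poly1d(np.polyfit(x, y, 1))
# ...while a baseline that memorises the mean fails as soon as the domain shifts.
mean_fit = lambda x, y: (lambda q, m=y.mean(): np.full_like(q, m))

print(chains(linear_fit), chains(mean_fit))   # True False
```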

Open questions 

1. Do any benefits attributed to AGI actually require generality, or merely sufficient breadth of capabilities?
2. Is generality a real property or a convenient abstraction?
3. If no sound argument exists for generality in itself, should we pivot toward developing the right set of highly-capable narrow systems?
4. Does this same issue apply to ASI?

Conclusion

Paradoxically, the lack of a solid argument for generality in and of itself does not seem to mean we should not keep trying to build AGI. Rather, it means we should be honest about why we are building it. Maybe we are building it not because we see value in generality itself, but because:

1. It seems inevitable given current incentives.
2. We believe (maybe incorrectly) that it will be more efficient.
3. We want specific capabilities that we don't yet know how to build, and believe a general system would, in virtue of being general, solve them.
4. The scaling hypothesis suggests generality may emerge whether we aim for it or not.

This clarity isn't merely for philosophical amusement; it matters for determining research priorities and governance efforts. If we're building towards AGI for instrumental reasons, we should:

1. Measure progress against the specific capabilities we actually want, rather than against "generality" in the abstract.
2. Prioritize alignment research accordingly.
3. Be explicit about whether what we need is genuine generality or merely sufficient breadth.

I think the fundamental question remains: are we building toward the right target, and do we even know what that target is?

I welcome counterarguments. If there exists a sound intrinsic argument for generality that I've missed, I'd genuinely like to hear it.

  1. ^

    I want to thank BlueDot Impact for accepting me into their inaugural cohort of "AGI Strategy", where this discussion arose. This post would not exist without their great efforts to build the much-needed Safety workforce.

  4. ^

    There may be a good argument to be developed here, if one can successfully argue that adaptability is an intrinsic component of generality, and not a mere after-effect.

  5. ^

    This formulation of generality as chainable out-of-distribution transfer draws on work in meta-learning and few-shot transfer learning. See Jiang et al. (2023), Tripuraneni et al. (2022), Sun et al. (CVPR 2019), and Ada et al. (2019) for theoretical foundations on OOD generalization and transfer bounds.



