https://www.seangoedecke.com/rss.xml (October 2)
Norms for using AI-generated content: content density is key

This article explores the social norms that should govern the use of AI-generated content. The author argues that the key is ensuring AI output is as content-dense as something a human would have written, i.e. that it is not "slop". AI-generated content is only appropriate to show others when it is substantive and valuable, without excessive filler. The author distinguishes acceptable scenarios, such as AI translation and concrete factual content, from long-winded, vacuous AI text; stresses that the same standard of substance applies to areas like code generation; and argues that the value of AI-assisted work lies in improving efficiency and sparking ideas, not in producing output that merely "looks fine" but lacks substance.

✅ **The core test for AI content is that it is not "slop"**: The author argues that whether AI-generated content is acceptable comes down to its "content density". AI output is only appropriate to show others when it is as substantive and valuable as something you would have written yourself. That means avoiding long-winded, vacuous, low-information output, such as expanding a short instruction into a lengthy email.

✅ **Distinguishing acceptable uses of AI content**: The article identifies AI translation and concrete factual content as two acceptable forms of AI-generated content. AI translation preserves the content density of the original, and factual content is, by its nature, nothing but information. In both cases the content does not provoke a negative reaction even though it is AI-generated.

✅ **Code generation must focus on substance**: AI-written code is held to the same "no slop" standard. That means avoiding redundant comments, unnecessary duplication, code that ignores the project's existing conventions, and obviously broken logic. AI-generated code should be thought through and reviewed as carefully as code written by hand.

✅ **The value and challenges of AI-assisted work**: In areas like code generation, AI can save developers time spent hunting through the codebase and understanding the system, and it can serve as a thinking tool for iterating on and exploring different approaches. The key, however, is that developers must review AI-generated code as carefully as they would their own, verifying its quality and fitness for purpose rather than trusting output that merely looks plausible.

In the early days of any new technology, the relevant social norms are still being workshopped. For mobile phones, that meant collectively deciding when and where phones should be on silent mode[1]. For instant messaging, that meant jumping right into the request instead of trying to do small talk first. What are the social norms we’re working out for AI right now? In my view, they’re about when it’s appropriate to show somebody AI-generated content.

For instance, it’s acceptable to run an email through ChatGPT to check for typos or unfinished sentences. But is it acceptable to get ChatGPT to write your entire email for you from scratch? If you’re having a problem in your life, it’s okay to talk to a language model about it. But it’s probably not okay to get a language model to write messages to your spouse (though it is okay to get it to write messages to your boss). If you’re a software engineer, is it okay to submit an AI-generated PR? Is it okay to get AI to generate commit messages or technical reports for you?

Here’s the point I’m going to argue in this post: it’s okay to show someone AI-generated output when it’s not slop. In other words, you should only show AI output when it’s as content-dense as something you would have produced yourself.

Should you never pass off AI-generated content as your own?

Six months ago, I wrote about this same question in On Slop, and concluded that you should never pass off AI-generated content as your own. I thought that the core problem with “slop” was the bait-and-switch: the unpleasant uncanny-valley sensation when you realise you’re not talking to a human. We’re seeing the beginnings of a norm forming where you have to disclose whether content is AI-generated upfront (for instance, three days ago the ghostty project made that mandatory for new PRs).

One intuitive reason to believe this is that it’s much more pleasant to read your own AI-generated content than to read AI content generated by someone else. W.H. Auden famously wrote in his essay Writing that “most people enjoy the sight of their own handwriting as they enjoy the smell of their own farts”. Is AI content the same?

I was reading the comments on this post[2] when I noticed lots of people resonating with this remark:

I can’t know for sure, but I’d be willing to put money down that my exact question and the commit were fed directly into an LLM which was then copy and pasted back to me. I’m not sure why, but I felt violated. It felt wrong.

For instance (emphasis mine):

I think the key here is no-one wants to read someone else’s ChatGPT. Even when it’s been vetted by another engineer via actual tests, there is something that in my flesh that makes me not want to read that output?

I won’t say too much, but I recently had an experience where it was clear that when talking with a colleague, I was getting back chat GPT output. I felt sick, like this just isn’t how it should be. I’d rather have been ignored.

I agree that there’s a fundamental difference between reading your own ChatGPT output and reading someone else’s. I will happily talk to GPT-5 all day[3], but I cannot imagine ever wanting to read the outputs of another person’s prompts. When I encounter those outputs accidentally, I get a genuine disgust reaction. Like the people I quoted above, there is something “in my flesh” that recoils.

Exceptions to the rule

Is the ideal social norm then that you should always disclose AI-generated content upfront? I don’t think so, for two reasons.

First, that norm doesn’t go far enough. There are many sources of disclosed AI-generated content that still revolt me (and the commenters I quoted above). When Quora introduced AI-generated questions and answers, I was bothered by them even though they were clearly indicated as such. On Twitter, I know the @grok account is AI generated, but I still don’t like reading its replies. If I get a work message from a colleague that expands a short “can u add caching” into a three-paragraph email, I hate reading it even if I know upfront that it’s AI-generated.

Second, there are also key exceptions to the norm: times when I think it’s acceptable to show people AI-generated content, even if you don’t disclose it.

The first exception is translation. If I’m talking to someone who doesn’t speak English fluently, and they’re running their messages through some AI language model, that’s fine[4]. The difference is that I’m still talking to them, even if there’s a language model in the middle. They’re not prompting “respond to this message”, they’re prompting “translate [my response] into English”.

The second exception is concrete technical content. I don’t mind reading concise AI-generated facts. If I’m researching industrial accidents and someone sends me a dot-point list of railroad accidents from the early 1900s, I don’t care if it’s AI-generated (as long as it’s accurate). I only get a disgust reaction when I see an entire AI-generated paragraph: in the worst case, a post with several paragraphs, sub-headers, and an expansive conclusion.

Because of this, I don’t think “always disclose AI-generated content” is a very good norm. There’s no harm in it, at least: if you want to disclose it, that’s fine, and if you want it to be disclosed to you, that’s also fine. But it doesn’t articulate why AI slop is bad, and it doesn’t give adequate guidance to people looking to use AI responsibly.

Content-sparse AI outputs

Why then does it feel so unpleasant to read someone’s “can u add caching” prompt translated into a three-paragraph polite email, even though I don’t mind AI-generated translations or lists of facts? The difference here is in the content density. You can’t expand a few words into three paragraphs without saying more things, and language models typically fill in those “more things” in the most anodyne and generic way possible: in this case, something like boilerplate about how caching will improve performance, without saying what should be cached or why.

You would never communicate so vaguely if you were writing it out yourself. Instead, you would be much more specific: which requests need caching, where the cache should live, and how long entries can stay valid.

A language model could fill in that information for you, but you’d have to do the work to figure those things out and feed it that information in advance. In theory, this doesn’t even require a human. An agentic process could do the work itself, if it were able to be sufficiently concise[5]. So we’ve arrived at a general principle for when AI-generated output is acceptable:

It’s okay to show someone AI-generated output when the content density in that output is the same as what you’d get from a competent human.

Translation is acceptable because you’re getting the exact same content density as the original-language prompt. Lists of concrete facts are acceptable because they’re nothing but content. But when someone blows up two or three dot points into a several-paragraph post, the content density is too low, and it triggers the disgust response. Likewise, when a Twitter bot is told “respond to this post”, it usually has nothing to say, so it outputs something like “wow, [message of the post] is so insightful!”, which is effectively zero-content.

What about code?

What about AI-generated code? Code can be content-dense or content-sparse in at least three ways. First, it can be redundant: either by containing many obvious comments, like Claude Sonnet 4 always does, or by performing operations that need not be performed[6]. Second, it can do things its own way instead of following existing conventions and helpers in the codebase. If I generate a PR with Claude Code and 90% of it constructs a redundant cache system, at most 10% of that PR can contain useful content. Third, it can be straight-up nonsense! Many AI-generated PRs are completely baffling - they do things that could never possibly work, such as saving results to a file “for extra durability” in a Docker-run service with no persistent disk storage.
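To make the “redundant” failure mode concrete, here is a small hypothetical sketch (not taken from the original post): the first function narrates every line with obvious comments and builds an intermediate array it doesn’t need, while the second expresses the same behaviour at full content density. The `User` type is invented purely for the example.

```typescript
interface User {
  id: number;
  isActive: boolean;
}

// Content-sparse: obvious comments and an intermediate array that isn't needed.
function getActiveUsersVerbose(users: User[]): User[] {
  // Create a new array to hold the active users
  const activeUsers: User[] = [];
  // Loop over every user in the input array
  for (const user of users) {
    // Check whether this user is active
    if (user.isActive === true) {
      // Add the active user to the result array
      activeUsers.push(user);
    }
  }
  // Return the array of active users
  return activeUsers;
}

// Content-dense: the same behaviour, with nothing the reader doesn't need.
function getActiveUsers(users: User[]): User[] {
  return users.filter((user) => user.isActive);
}
```

Both versions do the same thing; the first just forces a reviewer to wade through commentary and ceremony that carry no information.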

Given that, I think the principles for AI-generated code are the same as for any other kind of AI-generated content. It’s okay to submit an AI-generated PR when it’s content-dense: in other words, when it doesn’t repeat itself, fits into the codebase nicely, and doesn’t do anything obviously silly. That means you have to treat an AI-generated PR like a normal PR. You have to think through it as carefully as you would if you wrote it by hand.

A lot of people fail at this hurdle. Once generated, the PR often looks “good enough”. At a glance, it seems fine. If you have to dig in and work through it like you would a normal PR, what’s the point of generating it with AI in the first place? An AI skeptic would treat this question as rhetorical. But I think there’s still a lot of value.

First, the generated PR might reference parts of the codebase that you didn’t already know about, which saves you the time of having to hunt them down. Second, reading code is faster than writing it, because you can spend your entire brainpower on understanding the system instead of generating syntax. Third, a generated PR can be a useful thinking tool even if you end up discarding it. It’s easy to handwave some functionality in your head, but looking at it expressed as code is an entirely different matter. I often generate some code, read through it, decide that it can’t be the right approach, and then go down a totally different direction. That process of iteration would be much slower if I did it by hand.

Final thoughts

It’s tempting to say that responsible AI usage is all about disclosing when you’re using AI. However, that’s not correct. Responsible AI usage is about avoiding slop. That means making sure that any AI outputs you show to other people are content-dense: in other words, that they’re as concise as what you would have written without AI.

If you’re showing people AI outputs that have actual content and that are expressed concisely, it doesn’t really matter whether you disclose or not. And even if you do disclose, you still shouldn’t be putting AI “slop” in front of people - AI outputs that are vague, redundant, and contain only one or two pieces of content in an avalanche of words.


  1. For instance: yes in church, no in the supermarket, and “it depends” at work or home.

  2. I enjoyed the first half of the post but thought the second half got way over its skis.

  3. “Talk to” here means “use as a tool for learning”, not “have a human conversation with”. See my previous post on how I learn with LLMs.

  4. Is this exception conflating AI translation with generative AI? I don’t think so. First, state-of-the-art translation is generative - there’s no fundamental technical difference, because good translation needs a model that understands both languages. Second, many people use language models to translate in a non-traditional way. Instead of writing in their native language and hitting “translate”, they might write in a pidgin: using English where they feel confident they know how to express themselves, and falling back into their native language where they’re not so sure. Language models are excellent at translating that kind of thing into fluent English.

  5. Incidentally, this is why I often prefer GPT-5 to Sonnet 4 for agentic coding. Sonnet 4 writes long reports that I have to skim or ignore, while GPT-5 is much more concise and content-dense.

  6. Two examples: Claude likes to rescue and re-raise exceptions in multiple different places when it writes TypeScript, and all coding models will usually try to build unnecessary “enterprise-ready” epicycles into simple scripts (e.g. elaborate logging instead of print statements).
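     As a hypothetical illustration of that pattern (not an example from the post): the first function below catches an error only to log and re-throw it, adding words but no content; the second lets the error propagate with its original stack trace.

```typescript
import { readFile } from "node:fs/promises";

// "Enterprise-ready" version: the catch block only logs and re-raises.
async function loadConfigVerbose(path: string): Promise<string> {
  try {
    return await readFile(path, "utf8");
  } catch (err) {
    console.error(`[config-loader] failed to read ${path}:`, err);
    throw err;
  }
}

// Content-dense version: the caller sees the original error directly.
async function loadConfig(path: string): Promise<string> {
  return readFile(path, "utf8");
}
```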
