https://www.seangoedecke.com/rss.xml (October 2)
Norms for using AI-generated content: content density is key

This article explores the social norms that should govern the use of AI-generated content. The author argues that the key is ensuring AI output is as content-dense as something a human would have written, i.e. that it is not "slop". AI-generated content is only appropriate to show others when it is substantive and valuable, without excessive filler. The author distinguishes acceptable scenarios, such as AI translation and concrete factual content, from long-winded, vacuous AI text; stresses that the same standard of substance applies to areas like code generation; and argues that the value of AI-assisted work lies in improving efficiency and sparking ideas, not in producing output that merely "looks fine" but lacks substance.

✅ **The core test for AI content is that it is not "slop"**: The author argues that whether AI-generated content is acceptable comes down to its "content density". AI output is only appropriate to show others when it is as substantive and valuable as something you would have written yourself. That means avoiding long-winded, vacuous, low-information output, such as expanding a short instruction into a lengthy email.

✅ **Distinguishing acceptable uses of AI content**: The article identifies AI translation and concrete factual content as two acceptable forms of AI-generated content. AI translation preserves the content density of the original, and factual content is, by its nature, nothing but information. In both cases the content does not provoke a negative reaction even though it is AI-generated.

✅ **Code generation must focus on substance**: AI-written code is held to the same "no slop" standard. That means avoiding redundant comments, unnecessary duplication, code that ignores the project's existing conventions, and obviously broken logic. AI-generated code should be thought through and reviewed as carefully as code written by hand.

✅ **The value and challenges of AI-assisted work**: In areas like code generation, AI can save developers time spent hunting through the codebase and understanding the system, and it can serve as a thinking tool for iterating on and exploring different approaches. The key, however, is that developers must review AI-generated code as carefully as they would their own, verifying its quality and fitness for purpose rather than trusting output that merely looks plausible.

In the early days of any new technology, the relevant social norms are still being workshopped. For mobile phones, that meant collectively deciding when and where phones should be on silent mode[1]. For instant messaging, that meant jumping right into the request instead of trying to do small talk first. What are the social norms we’re working out for AI right now? In my view, they’re about when it’s appropriate to show somebody AI-generated content.

For instance, it’s acceptable to run an email through ChatGPT to check for typos or unfinished sentences. But is it acceptable to get ChatGPT to write your entire email for you from scratch? If you’re having a problem in your life, it’s okay to talk to a language model about it. But it’s probably not okay to get a language model to write messages to your spouse (though it is okay to get it to write messages to your boss). If you’re a software engineer, is it okay to submit an AI-generated PR? Is it okay to get AI to generate commit messages or technical reports for you?

Here’s the point I’m going to argue in this post: it’s okay to show someone AI-generated output when it’s not slop. In other words, you should only show AI output when it’s as content-dense as something you would have produced yourself.

Should you never pass off AI-generated content as your own?

Six months ago, I wrote about this same question in On Slop, and concluded that you should never pass off AI-generated content as your own. I thought that the core problem with “slop” was the bait-and-switch: the unpleasant uncanny-valley sensation when you realise you’re not talking to a human. We’re seeing the beginnings of a norm forming where you have to disclose whether content is AI-generated upfront (for instance, three days ago the ghostty project made that mandatory for new PRs).

One intuitive reason to believe this is that it’s much more pleasant to read your own AI-generated content than to read AI content generated by someone else. W.H. Auden famously wrote in his essay Writing that “most people enjoy the sight of their own handwriting as they enjoy the smell of their own farts”. Is AI content the same?

I was reading the comments on this post[2] when I noticed lots of people resonating with this remark:

I can’t know for sure, but I’d be willing to put money down that my exact question and the commit were fed directly into an LLM which was then copy and pasted back to me. I’m not sure why, but I felt violated. It felt wrong.

For instance (emphasis mine):

I think the key here is no-one wants to read someone else’s ChatGPT. Even when it’s been vetted by another engineer via actual tests, there is something that in my flesh that makes me not want to read that output?

I won’t say too much, but I recently had an experience where it was clear that when talking with a colleague, I was getting back chat GPT output. I felt sick, like this just isn’t how it should be. I’d rather have been ignored.

I agree that there’s a fundamental difference between reading your own ChatGPT output and reading someone else’s. I will happily talk to GPT-5 all day[3], but I cannot imagine ever wanting to read the outputs of another person’s prompts. When I encounter those outputs accidentally, I get a genuine disgust reaction. Like the people I quoted above, there is something “in my flesh” that recoils.

Exceptions to the rule

Is the ideal social norm then that you should always disclose AI-generated content upfront? I don’t think so, for two reasons.

First, that norm doesn’t go far enough. There are many sources of disclosed AI-generated content that still revolt me (and the commenters I quoted above). When Quora introduced AI-generated questions and answers, I was bothered by them even though they were clearly indicated as such. On Twitter, I know the @grok account is AI generated, but I still don’t like reading its replies. If I get a work message from a colleague that expands a short “can u add caching” into a three-paragraph email, I hate reading it even if I know upfront that it’s AI-generated.

Second, there are also key exceptions to the norm: times when I think it’s acceptable to show people AI-generated content, even if you don’t disclose it.

The first exception is translation. If I’m talking to someone who doesn’t speak English fluently, and they’re running their messages through some AI language model, that’s fine[4]. The difference is that I’m still talking to them, even if there’s a language model in the middle. They’re not prompting “respond to this message”, they’re prompting “translate [my response] into English”.

The second exception is concrete technical content. I don’t mind reading concise AI-generated facts. If I’m researching industrial accidents and someone sends me a dot-point list of railroad accidents from the early 1900s, I don’t care if it’s AI-generated (as long as it’s accurate). I only get a disgust reaction when I see an entire AI-generated paragraph: in the worst case, a post with several paragraphs, sub-headers, and an expansive conclusion.

Because of this, I don’t think “always disclose AI-generated content” is a very good norm. There’s no harm in it, at least: if you want to disclose it, that’s fine, and if you want it to be disclosed to you, that’s also fine. But it doesn’t articulate why AI slop is bad, and it doesn’t give adequate guidance to people looking to use AI responsibly.

Content-sparse AI outputs

Why then does it feel so unpleasant to read someone’s “can u add caching” prompt translated into a three-paragraph polite email, even though I don’t mind AI-generated translations or lists of facts? The difference here is in the content density. You can’t expand a few words into three paragraphs without saying more things, and language models typically fill in those “more things” in the most anodyne and generic way possible: in this case, something like boilerplate about how caching will improve performance, without saying what should be cached or why.

You would never communicate so vaguely if you were writing it out yourself. Instead, you would be much more specific: which requests need caching, where the cache should live, and how long entries can stay valid.

A language model could fill in that information for you, but you’d have to do the work to figure those things out and feed it that information in advance. In theory, this doesn’t even require a human. An agentic process could do the work itself, if it were able to be sufficiently concise[5]. So we’ve arrived at a general principle for when AI-generated output is acceptable:

It’s okay to show someone AI-generated output when the content density in that output is the same as what you’d get from a competent human.

Translation is acceptable because you’re getting the exact same content density as the original-language prompt. Lists of concrete facts are acceptable because they’re nothing but content. But when someone blows up two or three dot points into a several-paragraph post, the content density is too low, and it triggers the disgust response. Likewise, when a Twitter bot is told “respond to this post”, it usually has nothing to say, so it outputs something like “wow, [message of the post] is so insightful!”, which is effectively zero-content.

What about code?

What about AI-generated code? Code can be content-dense or content-sparse in at least three ways. First, it can be redundant: either by containing many obvious comments, like Claude Sonnet 4 always does, or by performing operations that need not be performed[6]. Second, it can do things its own way instead of following existing conventions and helpers in the codebase. If I generate a PR with Claude Code and 90% of it constructs a redundant cache system, at most 10% of that PR can contain useful content. Third, it can be straight-up nonsense! Many AI-generated PRs are completely baffling - they do things that could never possibly work, such as saving results to a file “for extra durability” in a Docker-run service with no persistent disk storage.
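To make the “redundant” failure mode concrete, here is a small hypothetical sketch (not taken from the original post): the first function narrates every line with obvious comments and builds an intermediate array it doesn’t need, while the second expresses the same behaviour at full content density. The `User` type is invented purely for the example.

```typescript
interface User {
  id: number;
  isActive: boolean;
}

// Content-sparse: obvious comments and an intermediate array that isn't needed.
function getActiveUsersVerbose(users: User[]): User[] {
  // Create a new array to hold the active users
  const activeUsers: User[] = [];
  // Loop over every user in the input array
  for (const user of users) {
    // Check whether this user is active
    if (user.isActive === true) {
      // Add the active user to the result array
      activeUsers.push(user);
    }
  }
  // Return the array of active users
  return activeUsers;
}

// Content-dense: the same behaviour, with nothing the reader doesn't need.
function getActiveUsers(users: User[]): User[] {
  return users.filter((user) => user.isActive);
}
```

Both versions do the same thing; the first just forces a reviewer to wade through commentary and ceremony that carry no information.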

Given that, I think the principles for AI-generated code are the same as for any other kind of AI-generated content. It’s okay to submit an AI-generated PR when it’s content-dense: in other words, when it doesn’t repeat itself, fits into the codebase nicely, and doesn’t do anything obviously silly. That means you have to treat an AI-generated PR like a normal PR. You have to think through it as carefully as you would if you wrote it by hand.

A lot of people fail at this hurdle. Once generated, the PR often looks “good enough”. At a glance, it seems fine. If you have to dig in and work through it like you would a normal PR, what’s the point of generating it with AI in the first place? An AI skeptic would treat this question as rhetorical. But I think there’s still a lot of value.

First, the generated PR might reference parts of the codebase that you didn’t already know about, which saves you the time of having to hunt them down. Second, reading code is faster than writing it, because you can spend your entire brainpower on understanding the system instead of generating syntax. Third, a generated PR can be a useful thinking tool even if you end up discarding it. It’s easy to handwave some functionality in your head, but looking at it expressed as code is an entirely different matter. I often generate some code, read through it, decide that it can’t be the right approach, and then go down a totally different direction. That process of iteration would be much slower if I did it by hand.

Final thoughts

It’s tempting to say that responsible AI usage is all about disclosing when you’re using AI. However, that’s not correct. Responsible AI usage is about avoiding slop. That means making sure that any AI outputs you show to other people are content-dense: in other words, that they’re as concise as what you would have written without AI.

If you’re showing people AI outputs that have actual content and that are expressed concisely, it doesn’t really matter whether you disclose or not. And even if you do disclose, you still shouldn’t be putting AI “slop” in front of people - AI outputs that are vague, redundant, and contain only one or two pieces of content in an avalanche of words.


  1. For instance: yes in church, no in the supermarket, and “it depends” at work or home.

  2. I enjoyed the first half of the post but thought the second half got way over its skis.

  3. “Talk to” here means “use as a tool for learning”, not “have a human conversation with”. See my previous post on how I learn with LLMs.

  4. Is this exception conflating AI translation with generative AI? I don’t think so. First, state-of-the-art translation is generative - there’s no fundamental technical difference, because good translation needs a model that understands both languages. Second, many people use language models to translate in a non-traditional way. Instead of writing in their native language and hitting “translate”, they might write in a pidgin: using English where they feel confident they know how to express themselves, and falling back into their native language where they’re not so sure. Language models are excellent at translating that kind of thing into fluent English.

  5. Incidentally, this is why I often prefer GPT-5 to Sonnet 4 for agentic coding. Sonnet 4 writes long reports that I have to skim or ignore, while GPT-5 is much more concise and content-dense.

  6. Two examples: Claude likes to rescue and re-raise exceptions in multiple different places when it writes TypeScript, and all coding models will usually try to build unnecessary “enterprise-ready” epicycles into simple scripts (e.g. elaborate logging instead of print statements).
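     As a hypothetical illustration of that pattern (not an example from the post): the first function below catches an error only to log and re-throw it, adding words but no content; the second lets the error propagate with its original stack trace.

```typescript
import { readFile } from "node:fs/promises";

// "Enterprise-ready" version: the catch block only logs and re-raises.
async function loadConfigVerbose(path: string): Promise<string> {
  try {
    return await readFile(path, "utf8");
  } catch (err) {
    console.error(`[config-loader] failed to read ${path}:`, err);
    throw err;
  }
}

// Content-dense version: the caller sees the original error directly.
async function loadConfig(path: string): Promise<string> {
  return readFile(path, "utf8");
}
```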
