A Deep Dive into the “Understanding” of Large Language Models (LLMs)

Writing from a skeptic’s perspective, the author takes a close look at what may be missing from large language models (LLMs) when it comes to “understanding”. Through a series of analogies, such as “map pieces” and “consistent domains”, the post illustrates how LLMs may lack global coherence, consistency, and a unified internal mental model. The author argues that LLMs may merely be piecing together scattered fragments of information rather than possessing a genuinely deep understanding of the content, a limitation that is especially apparent in complex logical reasoning and mathematical proofs. The post also examines LLMs’ ability to notice and correct inconsistencies, contrasting it with how humans learn.

🗺️ **Fragmented information processing**: The author likens an LLM to a bag holding countless map pieces: it can chain a few fragments together to answer a question, but it struggles to integrate information spread across many “maps” and cannot understand and infer from the whole the way a human can; it would never, for example, come up with a concept like “continental drift”.

⚖️ **Missing consistency and a global view**: The post notes that LLMs can define symbols inconsistently in the course of reasoning, much like assembling pictures of a crystal into patterns that are locally consistent in different regions but clash overall, instead of constructing a globally unified, logically rigorous whole.

🧠 **No unified mental model**: Drawing an analogy to aphantasia (the inability to form mental images), the author suggests that when an LLM generates content it may be stitching together scattered “little pictures” rather than rendering a pre-existing, unified “big picture”. This points to LLMs lacking an internally unified mental model capable of supporting global consistency.

💡 **Differences in online noticing and correction**: The post closes by asking whether LLMs, once scaled up, will work the way humans do. The author suggests that humans seem able to notice and fix inconsistencies “online”, without drastically enlarging their brains, hinting that LLMs’ abilities here may differ from humans’ in a fundamental way.

Published on November 8, 2025 11:37 PM GMT

When I put on my LLM skeptic hat, sometimes I think things like “LLMs don’t really understand what they’re saying”. What do I even mean by that? What’s my mental model for what is and isn’t going on inside LLMs’ minds?

First and foremost: the phenomenon precedes the model. That is, when interacting with LLMs, it sure feels like there’s something systematically missing which one could reasonably call “understanding”. I’m going to articulate some mental models below, but even if I imagine all those mental models are wrong, there’s still this feeling that LLMs are missing something and I’m not quite sure what it is.

That said, I do have some intuitions and mental models for what the missing thing looks like. So I’ll run the question by my intuitions a few times, and try to articulate those models.

First Pass: A Bag Of Map-Pieces

Imagine taking a map of the world, then taking a bunch of pictures of little pieces of the map - e.g. one picture might be around the state of Rhode Island, another might be a patch of Pacific Ocean, etc. Then we put all the pictures in a bag, and forget about the original map.

A smart human-like mind looking at all these pictures would (I claim) assemble them all into one big map of the world, like the original, either physically or mentally.

An LLM-like mind (I claim while wearing my skeptic hat) doesn't do that. It just has the big bag of disconnected pictures. Sometimes it can chain together three or four pictures to answer a question, but anything which requires information spread across too many different pictures is beyond the LLM-like mind. It would, for instance, never look at the big map and hypothesize continental drift. It would never notice if there's a topological inconsistency making it impossible to assemble the pictures into one big map.

Second Pass: Consistent Domains

Starting from the map-in-a-bag picture, the next thing which feels like it’s missing is something about inconsistency.

For example, when tasked with proving mathematical claims, a common pattern I’ve noticed from LLMs is that they’ll define a symbol to mean one thing… and then, later on in the proof, make some incompatible assumption about that symbol, as though it meant something totally different.

Bringing back the map-in-a-bag picture: rather than a geographical map, imagine lots of little pictures of a crystal, taken under an electron microscope. As with the map, we throw all the pictures in a bag. A human-like mind would try to assemble the whole thing into a globally-consistent picture of the whole crystal. An LLM-like mind will kinda… lay out a few pieces of the picture in one little consistent pattern, and then separately lay out a few pieces of the picture in another little consistent pattern, but at some point as it’s building out the two chunks they run into each other (like different crystal domains, but the inconsistency is in the map rather than the territory). And then the LLM just forges ahead without doing big global rearrangements to make the whole thing consistent.

That’s the mental picture I associate with the behavior of LLMs in proofs, where they’ll use a symbol to mean one thing in one section of the proof, but then use it in a totally different and incompatible way in another section.
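To make that pattern concrete, here’s a toy, made-up proof fragment (invented for illustration, not taken from any actual model transcript) showing the “one symbol, two incompatible meanings” failure:

```latex
% Hypothetical toy fragment (invented for illustration, not an actual LLM output)
% showing the "one symbol, two incompatible meanings" pattern described above.
\documentclass{article}
\usepackage{amsmath, amsthm}
\begin{document}
\begin{proof}
Let $n$ denote the degree of the polynomial $p$.
% ... several locally valid steps that all use $n = \deg p$ ...

Summing over the roots $r_1, \dots, r_n$ of $q$, we conclude \dots
% Here $n$ is silently reused as the number of roots of a different
% polynomial $q$. Each local step reads fine on its own, but the two
% usages of $n$ are never reconciled into one globally consistent proof.
\end{proof}
\end{document}
```

Locally, each patch of the proof is consistent; it’s only when you try to assemble the whole thing that the domains fail to line up.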

Third Pass: Aphantasia

What’s the next thing which feels like it’s missing?

Again thinking about mathematical proofs… the ideal way I write a proof is to start with an intuitive story/picture for why the thing is true, and then translate that story/picture into math and check that all the pieces follow as my intuition expects.[1]

Coming back to the map analogy: if I were drawing a map, I’d start with this big picture in my head of the whole thing, and then start filling in pieces. The whole thing would end up internally consistent by default, because I drew each piece to match the pre-existing picture in my head. Insofar as I draw different little pieces in a way that doesn’t add up to a consistent big picture, that’s pretty strong evidence that I wasn’t just drawing out a pre-existing picture from my head.

I’d weakly guess that aphantasia induces this sort of problem: an aphantasic, asked to draw a bunch of little pictures of different parts of an object or animal or something, would end up drawing little pictures which don’t align with each other and don’t combine into one consistent picture of the object or animal.

That’s what LLMs (and image generators) feel like. It feels like they have a bunch of little chunks which they kinda stitch together but not always consistently. That, in turn, is pretty strong evidence that they’re not just transcribing a single pre-existing picture or proof or whatever which is already “in their head”. In that sense, it seems like they lack a unified mental model.

Fourth Pass: Noticing And Improving

A last piece: it does seem like, as LLMs scale, they are able to assemble bigger and bigger consistent chunks. So do they end up working like human minds as they get big?

Maybe, and I think that’s a pretty decent argument, though the scaling rate seems pretty painful.

My counterargument, if I’m trying to play devil’s advocate, is that humans seem to notice this sort of thing in an online way. We don’t need to grow a 3x larger brain in order to notice and fix inconsistencies. Though frankly, I’m not that confident in that claim.

  1. ^

    I don’t always achieve that ideal; sometimes back-and-forth between intuition and math is needed to flesh out the story and proof at the same time, which is what most of our meaty research looks like.


