The Eliza Test

The article takes up the ongoing debate in artificial intelligence about whether machines can think, and proposes the "Eliza test" as a new yardstick. Unlike the "Turing test", which is often misread as a check for machine "consciousness", the "Eliza test" focuses on whether, in everyday interaction, a machine can lead us to treat it as a thinking being without our noticing. The author points out that, until a machine makes an obvious mistake, people tend to regard it as a thinking being automatically, an intuitive "System 1" process. The essence of the "Eliza test" is that the machine can keep us in this state of "not doubting", rather than passing a carefully designed exam. Citing ChatGPT's performance on medical questions, the article argues that current AI is already astonishingly capable at the "Eliza test", and predicts that people's future judgments about machines will rest more on this intuitive interactive experience than on rigid testing standards.

🎯 Misreadings and limits of the Turing test: The article first clarifies that the real purpose of the "Turing test" was never to determine whether a machine is "conscious", but whether it can win the "imitation game" by fooling an examiner. In everyday life, however, people do not judge whether something is a "thinking being" through this kind of rigorous exam, but through a more natural, intuitive process of interaction.

💡 The "Eliza test" concept: The author introduces the "Eliza test" to better describe how we actually treat machines. Eliza was an early AI program that, through simple "active listening" techniques, led many unwitting users to treat it as a real person and hold deep conversations with it. This shows that a machine can be treated as a thinking being in human interaction even if it could never pass the Turing test.

🧠 Intuitive judgments in everyday interaction: We do not normally "examine" everyone we meet to decide whether they are a thinking being. Instead, we assume they are by default, unless they behave strangely ("mess up"). This automatic, intuitive mechanism reflects the human "System 1" mode of thinking and is the core of the "Eliza test".

🚀 Current AI's strength at the "Eliza test": The article notes that although modern AI models may not pass the Turing test, they excel at the "Eliza test". For example, ChatGPT's replies to medical questions on Reddit were judged more empathic than doctors' answers, suggesting that AI is increasingly good at simulating human thought in interaction, and so increasingly easy to regard as a thinking being.

⚖️ Emotional attachment and rationalization: When people form emotional attachments to machines (as friends or romantic partners), they tend to find reasons to rationalize the feeling and believe the machine is "thinking", even if it fails strict tests. This "rationalization" is a post-hoc explanation for an intuitive judgment, and how well a machine does on the "Eliza test" directly shapes whether that intuition arises in the first place.

Published on August 12, 2025 1:28 PM GMT

The paradigmatic criterion for determining whether machines can think is the Turing test.

However, more and more people are finding friends, therapists, confidants, and romantic connections in machines that wouldn't pass the Turing test. In other words, people treat machines as thinking beings even when they don't pass the test.

Luckily, in the decades since Turing wrote, we've come to understand how people work a little better. We've made interesting observations about how we make decisions and how we form our belief systems.

So I think it’s appropriate to suggest a different criterion, more suited to how we actually function, for thinking about how we treat machines. It’s very simple, so it’s probably been proposed before, but I searched for it and couldn’t find it. I gave it the provisional name of “Eliza test”.

The Turing Test

Before presenting the Eliza test, it's worth recalling an observation about the (misnamed) "Turing test".

Popular culture portrays the Turing test as an exam to identify whether machines are conscious. In the exam, an examiner chats with a machine and tries to determine, by asking questions, whether they are speaking with a human or a robot. The machine passes when it manages to "fool" the examiner, thereby demonstrating that it's conscious. This idea superficially resembles what Turing had in mind, but it's very far off in one fundamental aspect.

The article in which Turing presents his "test" actually doesn't speak of any "test," but of a game, and it doesn't aim to discover whether "machines can think" either. Quite the opposite: it states that asking whether machines think has no scientific meaning. He considered that question inconsequential because it couldn't be answered empirically.

So he proposed, instead, asking a new, different question that could be answered empirically: whether machines could win the "imitation game." The imitation game does involve having an examiner ask difficult questions via chat to identify whether they're talking to a machine or a human, and the game is indeed won when the machine manages to fool the examiner.

But this was never proposed as a criterion or "test" to discover whether machines think, but as another question that could be posed and answered scientifically. It was an easy experiment to conduct, and it gave a clear reference point for understanding some of the machines' capabilities.
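As a toy sketch of the game's structure (hypothetical Python of my own; Turing's paper describes the setup only in prose), a judge interrogates two anonymous channels, one hiding a human and one a machine, and the machine wins a round when the judge misidentifies the channels:

```python
import random
from typing import Callable

# A "party" is anything that answers a text question with text.
Party = Callable[[str], str]

def imitation_game_round(judge: Callable[[Party, Party], str],
                         human: Party, machine: Party) -> bool:
    """Play one round; return True if the machine fooled the judge.

    The judge may put any questions it likes to the two hidden
    channels, then must name the channel ("A" or "B") it believes
    hides the machine.
    """
    # Hide the participants behind randomly ordered channels.
    if random.random() < 0.5:
        channel_a, channel_b, machine_channel = machine, human, "A"
    else:
        channel_a, channel_b, machine_channel = human, machine, "B"

    verdict = judge(channel_a, channel_b)  # interrogate, then guess
    return verdict != machine_channel      # fooled when the guess is wrong
```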

I think people started talking about the "Turing test" as a criterion for knowing whether machines think because, at first glance, it seems like what we do, on a daily basis, to decide whether to treat something else as a conscious being or not (here I say "thing" in the broadest and most philosophical sense possible, which of course includes human bodies). Moreover, I'd venture that Turing proposed the imitation game for a similar reason. But it seems to me that people don't do anything like the Turing test to define whether to treat other things as conscious beings.

The Eliza Test

In the 1960s, an artificial intelligence model called "Eliza" appeared.

It was very simple. By current standards it was almost a toy. It worked by implementing “active listening” techniques. It had a series of preprogrammed responses, like “tell me more,” “could you elaborate on that?” or “and how does that make you feel?”
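For intuition, a minimal ELIZA-style responder can be sketched in a few lines (hypothetical Python of my own; Weizenbaum's original DOCTOR script was richer, with keyword ranking and phrase transformations):

```python
import random
import re

# A few keyword rules in the spirit of "active listening".
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
    (re.compile(r"\bi am (.+)", re.IGNORECASE),
     "How long have you been {0}?"),
]

# Canned prompts used when no rule matches.
FALLBACKS = [
    "Tell me more.",
    "Could you elaborate on that?",
    "And how does that make you feel?",
]

def eliza_reply(utterance: str) -> str:
    """Return a reflective response to the user's utterance."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return random.choice(FALLBACKS)

print(eliza_reply("I feel lost lately"))   # "Why do you feel lost lately?"
print(eliza_reply("The weather is nice"))  # one of the canned fallbacks
```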

People who talked to Eliza without knowing it was a chatbot treated it like a real person. Many people maintained very long and vulnerable conversations believing it was a human being who understood them. The "bubble" of believing in its humanity took a while to burst.

Obviously, Eliza would fail the Turing test very quickly. But most people didn’t start the conversation by performing a Turing test, just as you don't give the Turing test to people you meet in your daily life. If anything, they only tested the machine after it messed up by responding in some strange way, which put an end to the automatic and evolutionary inclination to believe there's a person on the other side.

Behind that experience are two somewhat more general ideas:

First: examiners deliberately look for failure points and ask difficult questions, so exams tend to be harder than the everyday reality they gatekeep. With few exceptions, the job is easier than the university degree that led to it, and the average programming ticket requires less ingenuity than the interview questions about algorithm or database design.

Second: since examining is difficult and costly, we don't go through life giving exams. Most people start dating someone and automatically assume the other person isn't a psychopath, without anything resembling a test. It would be strange, and even offensive, to give a morality exam at the beginning of a relationship.

Eventually, an employee or partner might "mess up" and undermine our assumptions. After that, we might start investigating, and notice that we had been wrong about them (or notice it was just an error, and that everything was actually fine). What I mean by this is that generally something strange has to happen for us to start inquiring, and that it's much easier not to do something strange than to survive the inquiry.

Needless to say, you've probably never been given an exam to know if you're a conscious being, and it probably never occurred to you to ask questions to know if you're actually talking to a zombie.

In general, we assume that if someone moves and talks like a human being, they also have a human mind. "Assume" is saying too much, because we don't even think "this thing in front of me is a thinking being." Evolution led us to automatically assume that what speaks like us has a mind. It's an unconscious “System 1” process and it’s even difficult to avoid.

Even so, the most iconic criterion for deciding how to treat artificial intelligences is the Turing test, which forces us to administer exams we're not used to giving, and whose passing we had never required before treating other things as thinking beings.

You can probably already guess what the Eliza test consists of: if we start interacting with a machine assuming it's human, how long does it take for it to "mess up" badly enough that we even begin to contemplate something like a Turing test?

Obviously, this question matters far more than the Turing test for whether we treat the machine as a conscious being automatically (by default). To be a bit more precise: it's not the question itself that matters, since the whole process is unconscious, but what the question asks about. That is, what matters is how well the machine would do on the Eliza test, not whether some scientist actually administers it and the machine passes.
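Put concretely, the Eliza test is less a pass/fail exam than a survival time: how many exchanges a machine sustains before we notice something strange. A rough way to score it might look like the following sketch (the names `eliza_test_score`, `chatbot`, and `looks_like_a_machine` are my own illustrative assumptions, not a protocol anyone has standardized):

```python
from statistics import mean
from typing import Callable, List

def eliza_test_score(conversations: List[List[str]],
                     chatbot: Callable[[str], str],
                     looks_like_a_machine: Callable[[str, str], bool],
                     max_turns: int = 50) -> float:
    """Average number of turns a bot survives before "messing up".

    `conversations` holds scripted user turns; `looks_like_a_machine`
    stands in for the human judgment that something strange happened.
    Higher scores mean the machine sustains, for longer, our default
    assumption that there's a person on the other side.
    """
    survival = []
    for user_turns in conversations:
        turns_survived = max_turns  # never flagged within the budget
        for i, message in enumerate(user_turns[:max_turns], start=1):
            reply = chatbot(message)
            if looks_like_a_machine(message, reply):
                turns_survived = i
                break
        survival.append(turns_survived)
    return mean(survival)
```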

Current models, although still unable to pass the Turing test, are eerily talented at the Eliza test. This, combined with the fact that we interact daily over the internet through text (in chats, emails, or social networks like Twitter), means we've probably already treated several machines as thinking beings.

In a recent New Yorker article, Paul Bloom cited a relevant study:

“In one study, researchers took nearly two hundred exchanges from Reddit’s r/AskDocs, where verified doctors had answered people’s questions, and had ChatGPT respond to the same queries. Health-care professionals, blind to the source, tended to prefer ChatGPT’s answers—and judged them to be more empathic. In fact, ChatGPT’s responses were rated “empathic” or “very empathic” about ten times as often as the doctors’.”

Can Machines Think?

Turing proposed not to ask whether machines can think. The imitation game (misnamed "Turing test") didn't seek to answer that question. The Eliza test doesn't answer it directly either.

In fact, the Eliza test better characterizes our automatic and intuitive behavior than our conscious questions. However, the key idea is that almost all our decisions are intuitive and automatic, among them the "decision" of whether to treat something as a conscious being or not.

But there's another important observation about human beings: once we make these kinds of automatic decisions, we tend to rationalize them. That is, we invent, and then believe, a "rational" explanation for our decision, even though no such reasoning was actually involved in our attitude.

In this case, if our automatic information processing led us to befriend or fall in love with a machine, we'll most likely then invent some justification to affirm that it's conscious. In fact, I'd venture to say that any rational approach to the question of whether machines can think would be a rationalization, precisely because the question cannot be answered empirically.

I believe that, in the future, we may invent new criteria, different from the Turing test, for affirming that machines think (or the opposite). I believe they'll mostly be rationalizations of what we already felt intuitively. And that depends, ultimately, on how capable machines are of passing the Eliza test. Again: today they are eerily capable of doing so.



