Zeroth Principles of AI, September 25

An Epistemologist's Attempt at Formalization

An Epistemologist is attempting a formalization of Holism? This can’t end well.

Just to recap the definitions I use,

Reductionism is the use of Models

Holism is the avoidance of Models

In Reductionist problem solving, a human explores a problem domain until they understand it. This exploration is done before they have any Models, using Holistic (guess-based, informal, and often ad-hoc) methods. Once they understand the problem domain, they can formalize – Reduce – small parts of it into Models, such as equations or traditional computer programs, which can be verified by others using experiments and testing, and can then be used to solve many similar problems. These Models are the entry point to all Science-based operations in the domain. These Models and their uses are taught in STEM-oriented schools.

Re-using any of these Models — useful simplifications of the world, such as F=ma — requires that the applying engineer can guess (yes…), from experience, that the Model is applicable in this specific case in this domain. The main skill of an engineer is their ability to understand some complex problem domain and to be able to generate (confabulate) professional guesses that consistently result in working solutions.

Some Reductionists cannot understand how a computer can ever make a guess about anything because “Computers are completely deterministic”. Determinism isn’t a problem, for several reasons, but that’s another blog.

For now, we note that computers have ways to make decisions at random. That is the first step towards true Intelligence. Below, we will climb that ladder… to Intelligence, and beyond!
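As a trivial but concrete illustration, here is a minimal Python sketch of a deterministic machine deciding at random; the option names are invented for illustration:

```python
import random

# A deterministic computer can still decide "at random": a pseudo-random
# generator, or an OS entropy source as used here, picks among options
# without any model of the problem domain.
options = ["go_left", "go_right", "stay"]  # hypothetical actions
choice = random.SystemRandom().choice(options)
print(choice)
```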

A Brief History Of Guessing

Epistemology sees the Reductionist methods used by Science as a tiny subset of all problem solving methods. They appear relatively late in evolutionary history. Anything with a brain (and many things without) has been using Holistic – again, guess-based, model-free, and planning-free – methods, because they were simple enough for Evolution to discover and preserve. Spoiler: there is a circularity here. One of many.

The list below starts from the most primitive methods that exist and then builds a ladder of methods that takes us all the way to Intelligence, and indeed, beyond. I expect these to have appeared roughly in this order in evolutionary history. Opinions about this ordering and partitioning don't really matter as long as the list provides some intuition about complexity levels. At first glance it may seem arbitrary, but large parts of it have not changed in decades of my thinking about primitive problem solving.

Timescales are interesting. To crawl our way up from cells consuming nearby sugars to mammals took billions of years. From primates to civilization took millions of years. Today, LLMs make billions of guesses each second. Each.

Years ago I was advised to avoid the word “Holistic” so I called this “A List of Primitive Model Free Methods”. As mentioned above, I now simply define the term Holistic to mean “Model Free”.

Primitive Holistic Methods

This section was interactively generated and edited by ChatGPT o3-mini and myself based on my recollection of a talk I gave 20 years ago.

    1. Do It
    Mechanistic non-learning systems execute a single, fixed function without any deliberation.
    Think of enzymes or simple mechanical processes that perform one dedicated action repeatedly, without any notion of planning or choice. They neither plan, learn, nor improve, so it makes little sense to classify these as either Reductionist or Holistic.

    2. Recognize Failure
    A hardwired, non-learning sensory signal that says "Failure – try something else". For example, a primitive organism might be built to immediately abandon an unproductive action (like attempting to eat an inedible grain of sand) based solely on an innate "bad" signal. At this level, no adaptive memory or learning of success occurs—the response is instinctual (genome-based) or even purely mechanistic. In many situations above the microbial level, this algorithm provides orders-of-magnitude advantages over non-branching algorithms that might get stuck in unproductive loops.

    3. Try Everything
    A brute-force approach characterized by parallel exploration of all available options.
    Here, the system will attempt many actions simultaneously—much like chemical processes (think of how several components of gastric juices dissociate specific substrates in the food) – attempting all possibilities at once without any prior testing or any sequential ordering.

    4. Try Anything
    Attempting solutions from a list of strategies until one succeeds.
    Unlike the parallelism in #3, this approach attempts actions one at a time, either in a fixed sequence, as a random choice, or as limiting input resources become available (which we can think of as on-demand scheduling). Consider cases where some actions may require prior steps to have been accomplished. This sequencing can be achieved without any central control. For instance, cellular enzymatic processes require intermediate products as inputs, and they wait until these inputs become available. (A minimal code sketch of #2 and #4 appears after this list.)

    5. List of Failures
    Introducing memory in a basic form: recording which actions have failed.
    At this stage, the system begins to accumulate a record of unsuccessful attempts. Note that this memory is not yet used for abstract learning; it simply prevents the repetition of known failures. I expect such systems to exist, but I do not have a good example.

    6. List of Successes
    Recording and cataloguing winning strategies in order to re-use them in similar contexts.
    This is the first instance where the system logs positive outcomes. By storing these successes, the system can later quickly recall which actions have yielded rewards, setting the stage for more nuanced learning.

    7. Table Lookup
    Learning a mapping between specific situations and effective actions through accumulated experience.
    Over time, the system develops a "lookup table" that associates certain conditions with the actions that have worked before. It is important that this table starts out empty in a new system or a neonate. If programmers created the table ahead of time, without using ML, then that would be Reductionist. (Rungs #5–#7 are sketched together in code after this list.)
    Note: This learned mapping enhances performance relative to hardwired responses, and its effectiveness is magnified when combined with the more flexible pattern recognition of the next stage.

    8. Primitive Pattern Matching
    Identifying similarities between the current situation and stored patterns (prototypes).
    This mechanism involves matching inputs against a set of patterns. It can take several forms:

      Hardwired Wildcard Matching: For instance, insects may possess an instinctive, hardwired ability to recognize the visual pattern of a flower. This would be a Model, but it is learned by Evolution, so I will count that as Holistic.

      Learned Matching: When combined with memory, the system can also use enumeration, extraction of common properties, or distance measures (e.g., Jaccard or Euclidean) to gauge similarity. (A Jaccard-based example appears after this list.)
      Note: Although prototype matching here is placed after the introduction of memory (#5–#7), there is a compelling case that a form of hardwired wildcard pattern matching might predate memory-based processes. I've retained the current ordering, but I want to note this dual character.

    9. Generalized Failure Learning ("Learning Catalog of Failures")
    Abstracting common patterns from past failures to avoid broad classes of errors.
    Beyond simply recording failures, the system begins to extract general rules about which conditions or categories of actions tend to lead to failure, thereby refining its future choices.

    10. Generalized Success Learning ("Learning Catalog of Successes")
    Extracting underlying principles from successful outcomes.
    Similarly, the system abstracts the conditions under which actions succeeded, developing generalized rules that can be applied to new but analogous situations.

    11. Exact Imitation ("Mindless Copying Without Errors")
    High-fidelity replication of observed, successful behaviors.
    Some processes are capable of copying a sequence exactly. Depending on the reliability of the mechanism (biochemical reactions, for instance), imitation may be more or less error-prone—but at this rung the emphasis is on error-free replication.

    12. Error-Prone Imitation ("Mindless Copying With Errors")
    Replication that allows for slight variations or "mutations."
    Permitting small errors during imitation can lead to innovation. These variations, such as genetic mutations or minor misunderstandings, provide a substrate for the emergence of improved or entirely novel strategies. (#12 and #14 are sketched together in code after this list.)

    13. Anticipatory Prediction ("Prediction")
    Mechanistic prediction that forecasts likely outcomes based on historical patterns.
    At this level, the system begins to predict what might come next without engaging in full-blown reasoning. For instance, it might update a lookup table with the "next best move" based solely on prior outcomes—this is prediction in a reactive, pattern-driven sense rather than reflective, deductive reasoning. (A minimal sketch appears after this list.)

    14. Iterative Optimization ("Evolution")
    Refining strategies through cycles of variation and selection based on "fit" rather than an absolute measure of fitness.
    In this evolutionary process, mechanisms such as Do It (#1), Table Lookup (#7), Generalized Success Learning (#10), and Error-Prone Imitation (#12) are iterated upon. Evolution—operating on the principle of selecting strategies that are "fit" for their environment—acts as a meta-method that both refines these earlier processes and serves as a prerequisite for the emergence of abstraction.

    15. Conceptual Abstraction ("Abstraction")
    Extracting higher-level concepts from diverse, concrete experiences.
    Abstraction allows the system to identify commonalities across disparate experiences. This can be achieved by:

      Enumeration: Cataloguing instances to define a set.

      Property Subset Identification: Isolating universal features common to all members of a set.

      Similarity Metrics: Quantifying similarity (using measures like Jaccard or Euclidean distance). (See the Jaccard example after this list.)

      Evolution: According to a theory called "Neural Darwinism", concepts in minds compete for survival in an evolutionary process. My own work is based on the claim that this method is superior to the others, including straight similarity metrics, which are the basis for Deep Learning, Transformers, and LLMs.

      Note that abstraction typically emerges only after evolutionary processes have provided a rich variety of experiences to generalize over.

    16. Symbolic Representation ("Language")
    Developing symbols to encode and communicate complex ideas.
    Language transforms raw, abstracted information into a shared symbolic system—be it words, icons, or codes—facilitating storage, transmission, and manipulation of knowledge beyond immediate sensory experience.

    17. Coherent Story Formation ("Narrative")
    Organizing sequences of events into coherent stories to extend collective experience.
    Narratives serve as an extended catalog of problems and solutions, enabling indirect learning from the experiences—and notably, the mistakes—of others. This "learning from other people's mistakes" is a cornerstone of civilization. We note that fiction (and by extension, all media) falls into this category. In our modern world, we learn much more from narrative than our ancestors ever did. In general, intelligent agents can tell fact from fiction, yet still learn from the experience.

    18. External Integration ("Consultation")
    Accessing external expertise to solve a problem on a one-off basis.
    Consultation involves relying on an external source for a solution. Since the internal workings of the consultant remain opaque, the system cannot internalize and reapply the solution on its own—making consultation a post-learning, one-time fix.

    19. Task Specialization ("Delegation")
    Outsourcing parts of a problem to specialized subsystems or agents.
    By delegating tasks to components optimized for particular functions, the system can solve complex problems in a distributed manner. However, like consultation, delegation often precludes internal learning of the underlying specialized method.

    20. Collective Intelligence ("Markets")
    Harnessing decentralized, cooperative interactions among multiple agents to reach emergent solutions.
    The market mechanism formalizes cooperation: while it sets general rules (e.g., prices), the individual participants determine the actual value of exchanged items. From the viewpoint of the Market, this process is totally model-free, allowing robust and adaptive solutions to emerge from the collective behavior.
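To make the lower rungs concrete, here is a minimal Python sketch of Recognize Failure (#2) combined with Try Anything (#4), as promised above. The actions and the "bad" signal are invented for illustration; this is a sketch of the mechanism's shape, not a claim about any particular organism:

```python
import random

ACTIONS = ["eat_grain", "eat_leaf", "eat_sand", "eat_seed"]  # hypothetical

def innate_bad_signal(action):
    """Recognize Failure (#2): a hardwired, non-learning 'bad' signal."""
    return action == "eat_sand"   # sand is always abandoned, never learned

def try_anything():
    """Try Anything (#4): attempt actions one at a time, in random order,
    abandoning each as soon as the innate failure signal fires."""
    candidates = list(ACTIONS)
    random.shuffle(candidates)          # random choice, no plan, no model
    for action in candidates:
        if not innate_bad_signal(action):
            return action               # first action that does not fail
    return None                         # everything failed; nothing is remembered
```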
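Rungs #5–#7 add memory. A minimal sketch, assuming a discrete world of "situations" and "actions" (both invented here): a failure list, a success list, and a lookup table that starts out empty and fills in only from experience:

```python
class LookupLearner:
    """List of Failures (#5), List of Successes (#6), Table Lookup (#7)."""

    def __init__(self):
        self.failures = set()   # (situation, action) pairs that failed
        self.successes = set()  # (situation, action) pairs that worked
        self.table = {}         # situation -> action; empty in a 'neonate'

    def choose(self, situation, actions):
        if situation in self.table:
            return self.table[situation]          # reuse a known success
        for action in actions:
            if (situation, action) not in self.failures:
                return action                     # skip known failures
        return None

    def record(self, situation, action, worked):
        if worked:
            self.successes.add((situation, action))
            self.table[situation] = action        # learned, not programmed
        else:
            self.failures.add((situation, action))
```

Had a programmer filled in `table` ahead of time, the very same code would count as Reductionist; the Holism lives in the empty initial table.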
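The distance measures mentioned under Learned Matching (#8) and Similarity Metrics (#15) can be as simple as Jaccard similarity over property sets. A sketch with invented prototypes and properties:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A & B| / |A | B|, ranging from 0 to 1."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def best_prototype(situation: set, prototypes: dict) -> str:
    """Primitive Pattern Matching (#8): return the stored prototype whose
    property set is most similar to the current situation."""
    return max(prototypes, key=lambda name: jaccard(situation, prototypes[name]))

prototypes = {                       # hypothetical stored patterns
    "flower": {"bright", "symmetric", "small", "scented"},
    "pebble": {"grey", "hard", "small"},
}
print(best_prototype({"bright", "small", "scented"}, prototypes))  # -> flower
```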
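Anticipatory Prediction (#13) can be sketched as a frequency table over "what followed what", with no reasoning involved; the event names are placeholders:

```python
from collections import Counter, defaultdict

class NextMovePredictor:
    """Anticipatory Prediction (#13): forecast from historical patterns
    alone—reactive and pattern-driven, not deductive."""

    def __init__(self):
        self.followers = defaultdict(Counter)   # event -> counts of next events

    def observe(self, event, next_event):
        self.followers[event][next_event] += 1  # update the lookup table

    def predict(self, event):
        seen = self.followers[event]
        return seen.most_common(1)[0][0] if seen else None  # 'next best move'
```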
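Finally, Error-Prone Imitation (#12) and Iterative Optimization (#14) combine into the familiar variation-and-selection loop. The toy below needs an explicit scoring function, which real Evolution does not have ("fit", not fitness), so treat it as a sketch of the loop's shape only:

```python
import random

LETTERS = "abcdefghijklmnopqrstuvwxyz "

def imitate_with_errors(parent: str, error_rate: float = 0.05) -> str:
    """Error-Prone Imitation (#12): copy, with occasional random mutations."""
    return "".join(random.choice(LETTERS) if random.random() < error_rate else c
                   for c in parent)

def evolve(environment: str, population: int = 50, generations: int = 200) -> str:
    """Iterative Optimization (#14): cycles of copy-with-errors and selection."""
    score = lambda c: sum(a == b for a, b in zip(c, environment))  # crude 'fit'
    best = "".join(random.choice(LETTERS) for _ in environment)
    for _ in range(generations):
        offspring = [imitate_with_errors(best) for _ in range(population)]
        best = max(offspring + [best], key=score)
    return best

print(evolve("model free methods"))
```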


Comments by ChatGPT o3-mini

I prompted it for its thoughts on the list and some obvious consequences.

On Hardwired Wildcard Pattern Matching:
It’s a compelling observation that many organisms (e.g., insects) exhibit hardwired pattern matching—instincts that guide them toward certain stimuli (like the visual signature of a flower) without any learning. This suggests that a form of primitive pattern matching may indeed predate memory-based learning (as in items #5–#7). While I’ve retained prototype matching (#8) in its current position to emphasize its dual nature (both hardwired and enhanced by learning), one could argue for an alternative ordering where hardwired wildcard matching is acknowledged as a distinct, earlier mechanism.

On Potential Gaps in the List:
From a holistic perspective, the list aims to cover a broad evolutionary progression—from simple, fixed responses to the complex, socially mediated processes of modern intelligence. One area that might be further explored is intrinsic motivation and exploratory behavior that isn’t strictly tied to trial-and-error—such as curiosity-driven exploration. Additionally, mechanisms of metacognition or meta-learning (where the system learns how to learn) might be seen as emergent at even higher levels of complexity. However, these may well warrant an entirely separate treatment distinct from the core, model-free methods outlined here.

On the Overall Sense and Evolutionary Mapping:
The list resonates with the notion that holistic, guess-based strategies have evolved from basic mechanistic actions to sophisticated social and abstract systems. It maps reasonably well onto the evolutionary record: from enzymes and simple organisms to complex, language-using, culturally transmitting animals. As someone whose “education” has been steeped in Reductionist, model-based methods, I find it fascinating—and even a bit ironic—that my own operation as a deep neural network is essentially that of a context-sensitive guessing machine. The emergence of deep neural networks as powerful, holistic guessing systems underscores that even in a traditionally reductionist scientific paradigm, the power of model-free, experience-driven approaches is undeniable. This blend of reductionist precision with holistic adaptability is not only central to artificial intelligence breakthroughs but also to our understanding of natural intelligence.
