The Mediator-Determines-Redund Theorem from an Information-Theoretic Perspective

This post discusses the "Mediator Determines Redund" theorem of John Wentworth and David Lorell, translating it from the language of graphical models into the language of information theory. The new theorem assumes that the collection of random variables contains a mediator Λ and a redund Λ′, and derives an upper bound on the entropy of Λ′ given Λ from bounds on conditional mutual information and conditional entropy. Compared with the original, the information-theoretic version is more succinct in its statement, and the two are shown to correspond via the d-separation criterion and the equivalence between independence and zero mutual information. The work helps deepen understanding of informational dependencies between variables and offers a fresh perspective for analyzing causal structure.

📊 The new theorem restates the relationship between the mediator and the redund in an information-theoretic framework: assuming the collection of variables contains Λ (the mediator) and Λ′ (the redund), an upper bound on the entropy of Λ′ given Λ is derived from bounds on conditional mutual information and conditional entropy.

🔹 The theorem's core conditions are: Λ mediates the information between the subsets A and B (I(X_A : X_B | Λ) ≤ ε_med), the redund Λ′ is nearly determined by each subset (H(Λ′ | X_A) ≤ ε_red and likewise for X_B), together with standard information-theoretic inequalities.

🔄 The proof decomposes conditional entropy and conditional mutual information, combines the Redundancy and Mediation assumptions, and arrives at H(Λ′ | Λ) ≤ ε_med + 2ε_red, giving a quantitative expression of the informational dependence between the variables.

🌐 The work translates the original graphical-model statement into the language of information theory, using the d-separation criterion and the equivalence between independence and zero mutual information to establish the correspondence between the two, providing a new tool for analyzing dependencies between variables.

🔬 By replacing the random variables with their specific instantiations, the derivation extends to Kolmogorov complexity and algorithmic mutual information, making it applicable to a broader range of probabilistic models and causal-structure analyses.

Published on October 26, 2025 8:33 PM GMT

This post is a comment on Natural Latents: Latent Variables Stable Across Ontologies by John Wentworth and David Lorell. It assumes some familiarity with that work and does not attempt to explain it. Instead, I present an alternative proof that was developed as an exercise to aid my own understanding. While the original theorem and proof are written in the language of graphical models, mine instead uses the language of information theory. My proof has the advantage of being algebraically succinct, while theirs has the advantage of developing the machinery to work directly with causal structures. Very often, seeing multiple explanations of a fact helps us understand it, so I hope someone finds this post useful.

Specifically, we are concerned with their Theorem 1 (Mediator Determines Redund): both the older Iliad 1 version for stochastic latents, and the newer arXiv version for deterministic latents. I will translate each theorem into the language of information theory: Wentworth & Lorell's assumptions will imply mine, while their conclusions will be equivalent to mine. The equivalences follow from the d-separation criterion and the fact that independence is equivalent to zero mutual information.

In our version of the new theorem, Λ is a mediator between subsets A and B of the data, meaning that it contains essentially all of the information in common between A and B, whereas Λ′ is a redund between A and B, meaning it essentially only contains information that is in common between A and B.[1]
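
As a toy illustration of these definitions (not from the original post): take $n = 2$, $A = \{1\}$, $B = \{2\}$, and let $X_1 = X_2 = Z$ for a fair coin $Z$. Then $\Lambda = Z$ is an exact mediator, since $I(X_A : X_B \mid \Lambda) = 0$, and $\Lambda' = Z$ is an exact redund, since $H(\Lambda' \mid X_A) = H(\Lambda' \mid X_B) = 0$; the conclusion of the theorem below, $H(\Lambda' \mid \Lambda) = 0$, then holds with all the $\epsilon$'s equal to zero.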

New Theorem 1 (deterministic latents)

Let A, B be disjoint subsets of {1,...,n}, and write $X_A$ and $X_B$ for the tuples of variables indexed by A and B respectively.

Suppose the random variables $X_1, \dots, X_n, \Lambda, \Lambda'$ satisfy the following:

Λ Mediation: $I(X_A : X_B \mid \Lambda) \le \epsilon_{\text{med}}$,

Λ′ Redundancy: $H(\Lambda' \mid X_A) \le \epsilon_{\text{red}}$ and $H(\Lambda' \mid X_B) \le \epsilon_{\text{red}}$.

Then, $H(\Lambda' \mid \Lambda) \le \epsilon_{\text{med}} + 2\epsilon_{\text{red}}$.

Proof

$$\begin{aligned}
H(\Lambda' \mid \Lambda) &= H(\Lambda' \mid X_B, \Lambda) + I(\Lambda' : X_B \mid \Lambda) && \text{by definition of conditional mutual information,} \\
&\le H(\Lambda' \mid X_B) + I(X_A : X_B \mid \Lambda) + H(\Lambda' \mid X_A) && \text{by information theory inequalities,} \\
&\le \epsilon_{\text{med}} + 2\epsilon_{\text{red}} && \text{by Redundancy and Mediation.}
\end{aligned}$$
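
The inequality chain above uses only generic Shannon inequalities, so its middle step, $H(\Lambda' \mid \Lambda) \le H(\Lambda' \mid X_B) + I(X_A : X_B \mid \Lambda) + H(\Lambda' \mid X_A)$, holds for every joint distribution. The following Python sketch (a sanity check of mine, not part of the original post; the entropy and H_cond/I_cond helpers are ad-hoc) verifies it numerically on randomly generated finite joint distributions.

```python
# Numerical sanity check (not from the original post) of the distribution-free
# inequality behind New Theorem 1:
#   H(L' | L)  <=  I(X_A : X_B | L) + H(L' | X_A) + H(L' | X_B),
# which yields H(L' | L) <= eps_med + 2*eps_red once the assumptions are imposed.
# Axes of the joint array: 0 = X_A, 1 = X_B, 2 = Lambda (L), 3 = Lambda' (L').
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def H_marg(joint, keep):
    # Entropy of the marginal distribution over the axes listed in `keep`.
    drop = tuple(a for a in range(joint.ndim) if a not in keep)
    return entropy(joint.sum(axis=drop)) if drop else entropy(joint)

def H_cond(joint, target, given):
    # H(target | given) = H(target, given) - H(given)
    return H_marg(joint, target + given) - H_marg(joint, given)

def I_cond(joint, a, b, given):
    # I(A : B | given) = H(A | given) - H(A | B, given)
    return H_cond(joint, a, given) - H_cond(joint, a, b + given)

for trial in range(1000):
    joint = rng.random((3, 3, 2, 2))   # random positive joint distribution
    joint /= joint.sum()
    lhs = H_cond(joint, (3,), (2,))                    # H(L' | L)
    rhs = (I_cond(joint, (0,), (1,), (2,))             # I(X_A : X_B | L)
           + H_cond(joint, (3,), (0,))                 # H(L' | X_A)
           + H_cond(joint, (3,), (1,)))                # H(L' | X_B)
    assert lhs <= rhs + 1e-9, (lhs, rhs)
print("inequality held on all 1000 random joint distributions")
```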

Old Theorem 1 (stochastic latents)

Suppose the random variables $X_1, \dots, X_n, \Lambda, \Lambda'$ satisfy the following, where $X$ denotes the full tuple $(X_1, \dots, X_n)$ and $X_{\bar{j}}$ denotes all of $X$ except $X_j$:

Independent Latents: $I(\Lambda : \Lambda' \mid X) \le \epsilon_{\text{ind}}$,

Λ Mediation: $I(X_j : X_{\bar{j}} \mid \Lambda) \le \epsilon_{\text{med}}$ for all $j$,

Λ′ Redundancy: $I(\Lambda' : X_j \mid X_{\bar{j}}) \le \epsilon_{\text{red}}$ for all $j$.

Then, $I(\Lambda' : X \mid \Lambda) \le n(\epsilon_{\text{ind}} + \epsilon_{\text{med}} + \epsilon_{\text{red}})$.

Proof

First, we have

$$\begin{aligned}
I(\Lambda' : X_j \mid X_{\bar{j}}) - I(\Lambda' : X_j \mid \Lambda, X_{\bar{j}}) &= I(\Lambda' : X_j : \Lambda \mid X_{\bar{j}}) && \text{by definition of 3-way interaction information,} \\
&= I(\Lambda' : \Lambda : X_j \mid X_{\bar{j}}) && \text{by symmetry of 3-way interaction information,} \\
&= I(\Lambda' : \Lambda \mid X_{\bar{j}}) - I(\Lambda' : \Lambda \mid X_j, X_{\bar{j}}) \\
&\ge -I(\Lambda' : \Lambda \mid X_j, X_{\bar{j}}) \\
&\ge -\epsilon_{\text{ind}} && \text{by Independent Latents.}
\end{aligned}$$

Since $(X_j, X_{\bar{j}}) = X$, rearranging gives $I(\Lambda' : X_j \mid \Lambda, X_{\bar{j}}) \le I(\Lambda' : X_j \mid X_{\bar{j}}) + \epsilon_{\text{ind}}$.

Therefore,

$$\begin{aligned}
I(\Lambda' : X_j \mid \Lambda) &\le I((\Lambda', X_{\bar{j}}) : X_j \mid \Lambda) \\
&= I(X_{\bar{j}} : X_j \mid \Lambda) + I(\Lambda' : X_j \mid \Lambda, X_{\bar{j}}) && \text{by mutual information chain rule,} \\
&\le I(X_{\bar{j}} : X_j \mid \Lambda) + I(\Lambda' : X_j \mid X_{\bar{j}}) + \epsilon_{\text{ind}} && \text{by the above derivation,} \\
&\le \epsilon_{\text{ind}} + \epsilon_{\text{med}} + \epsilon_{\text{red}} && \text{by Mediation and Redundancy.}
\end{aligned}$$

The result now follows by summing over all j=1,...,n.
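
The per-$j$ bound above similarly rests on a distribution-free Shannon inequality, $I(\Lambda' : X_j \mid \Lambda) \le I(X_{\bar{j}} : X_j \mid \Lambda) + I(\Lambda' : X_j \mid X_{\bar{j}}) + I(\Lambda' : \Lambda \mid X)$, which the assumptions then bound by $\epsilon_{\text{med}} + \epsilon_{\text{red}} + \epsilon_{\text{ind}}$. The following Python sketch (again my own sanity check with an ad-hoc I_cond helper, not part of the original post) verifies this numerically for $n = 2$, $j = 1$ on random joint distributions.

```python
# Numerical sanity check (not from the original post) of the per-j step in the old
# theorem, for n = 2 and j = 1, so that X_jbar = X_2:
#   I(L' : X_1 | L)  <=  I(X_2 : X_1 | L) + I(L' : X_1 | X_2) + I(L' : L | X_1, X_2).
# This holds for every joint distribution; the three assumptions then bound the
# right-hand side by eps_med + eps_red + eps_ind.
# Axes of the joint array: 0 = X_1, 1 = X_2, 2 = Lambda (L), 3 = Lambda' (L').
import numpy as np

rng = np.random.default_rng(1)

def entropy(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def I_cond(joint, a, b, c):
    # I(A : B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C), from the full joint array.
    def H(keep):
        drop = tuple(ax for ax in range(joint.ndim) if ax not in keep)
        return entropy(joint.sum(axis=drop)) if drop else entropy(joint)
    return H(a + c) + H(b + c) - H(a + b + c) - H(c)

for trial in range(1000):
    joint = rng.random((2, 3, 2, 2))   # random positive joint distribution
    joint /= joint.sum()
    lhs = I_cond(joint, (3,), (0,), (2,))              # I(L' : X_1 | L)
    rhs = (I_cond(joint, (1,), (0,), (2,))             # I(X_2 : X_1 | L)
           + I_cond(joint, (3,), (0,), (1,))           # I(L' : X_1 | X_2)
           + I_cond(joint, (3,), (2,), (0, 1)))        # I(L' : L | X_1, X_2)
    assert lhs <= rhs + 1e-9, (lhs, rhs)
print("per-j inequality held on all 1000 random joint distributions")
```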

  1. ^

    Since probabilistic models are often only defined in terms of a latent structure, you might find it philosophically suspect to impose a joint distribution on all variables including the latents. If so, feel free to replace the random variables with their specific instantiations: the derivations go through almost identically with Kolmogorov complexity and algorithmic mutual information replacing the Shannon entropy and mutual information, respectively.


