1. Exponential family
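- Definition
  - PDF: $p(y;x)=\exp\left(\lambda(x)^{T}t(y)-\alpha(x)+\beta(y)\right)$, written $y\sim\mathcal{E}(x;\lambda(\cdot),t(\cdot),\beta(\cdot))$
  - natural statistic: $t(y)$
  - natural parameter: $\lambda(x)$
  - log-partition function: $\alpha(x)$
  - partition function: $Z(x)=\exp(\alpha(x))$
  - base distribution: $\exp(\beta(y))$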
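- Regularity condition: the family is regular if the support of every member $p(y;x)$ does not depend on $x$.
  - In essence, this is the CRB regularity condition that differentiation and integration can be interchanged (the limits $a$, $b$ do not depend on $x$):
$$E\left[\frac{\partial}{\partial x}\ln p(y;x)\right]=\int\frac{\partial}{\partial x}p(y;x)\,dy=\frac{\partial}{\partial x}\int_a^b p(y;x)\,dy=0$$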
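To make the regularity condition concrete, the following minimal sketch compares the expected score of a Bernoulli model (support independent of $x$) with that of a uniform model on $[0,x]$ (support depends on $x$). The uniform example and all function names are assumptions chosen for illustration; they do not appear in the notes.

```python
def bernoulli_score(y, x):
    """d/dx ln p(y;x) for p(y;x) = x^y (1-x)^(1-y)."""
    return y / x - (1 - y) / (1 - x)

def expected_score_bernoulli(x):
    """E[score] under B(x): sum over the two outcomes."""
    return sum(p * bernoulli_score(y, x) for y, p in ((0, 1 - x), (1, x)))

def expected_score_uniform(x):
    """For U(0, x), ln p = -ln x on the support, so the score is -1/x
    for every y in the support and its expectation is -1/x as well."""
    return -1.0 / x

# Regular family: the expected score vanishes.
assert abs(expected_score_bernoulli(0.3)) < 1e-12
# Support depends on x: the expected score is not zero.
assert expected_score_uniform(2.0) == -0.5
```

The uniform case fails because the upper limit of the integral moves with $x$, so the derivative cannot be pulled outside the integral.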
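- An exponential family can arise in several ways.
- Many distributions can be written directly in exponential-family form.
  - Bernoulli distribution: $y\sim B(x)$
$$p(y;x)=x^{y}(1-x)^{1-y},\qquad \ln p(y;x)=\left(\ln\frac{x}{1-x}\right)y-\left(-\ln(1-x)\right)$$
  - Gaussian: $y=[y_1,y_2]^{T}$ with i.i.d. components $y_i\sim N(x,1)$
$$p(y;x)=\frac{1}{2\pi}\exp\left((y_1+y_2)x-x^{2}-\frac{y_1^{2}+y_2^{2}}{2}\right)$$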
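As a quick check of the Bernoulli example, this small sketch evaluates the pmf both directly and through the exponential-family form with $\lambda(x)=\ln\frac{x}{1-x}$, $t(y)=y$, $\alpha(x)=-\ln(1-x)$, $\beta(y)=0$. The function names are illustrative assumptions.

```python
import math

def bernoulli_pmf(y, x):
    """Direct form x^y (1-x)^(1-y) for y in {0, 1}."""
    return x**y * (1 - x)**(1 - y)

def bernoulli_expfam(y, x):
    """Same pmf via exp(lambda(x) t(y) - alpha(x) + beta(y))."""
    lam = math.log(x / (1 - x))   # natural parameter
    t = y                         # natural statistic
    alpha = -math.log(1 - x)      # log-partition function
    beta = 0.0                    # ln of the base distribution
    return math.exp(lam * t - alpha + beta)

# The two forms agree for every outcome and several parameter values.
for x in (0.1, 0.5, 0.9):
    for y in (0, 1):
        assert math.isclose(bernoulli_pmf(y, x), bernoulli_expfam(y, x))
```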
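- Geometric mean of several distributions
$$p(y;x)=\frac{p_1^{x}(y)\,p_2^{1-x}(y)}{Z(x)},\qquad \ln p(y;x)=x\ln\frac{p_1(y)}{p_2(y)}-\ln Z(x)+\ln p_2(y)$$
  - Example: $p_1(y)\sim B\left(\frac{1}{1+e^{-1}}\right)$, $p_2(y)\sim B(1/2)$
$$p(y;x)\propto\left(\frac{1}{1+e^{-1}}\right)^{xy}\left(\frac{e^{-1}}{1+e^{-1}}\right)^{x(1-y)}\left(\frac{1}{2}\right)^{1-x}\sim B\left(\frac{1}{1+e^{-x}}\right),\qquad \frac{p(y=1;x)}{p(y=0;x)}=e^{x}$$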
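The geometric-mean example can be verified numerically. The sketch below normalizes $p_1^{x}p_2^{1-x}$ for two Bernoulli distributions and checks that the result is $B\left(\frac{1}{1+e^{-x}}\right)$ with odds $e^{x}$; the helper names are assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def geometric_mean_pmf(x, p1, p2):
    """Return [p(y=0;x), p(y=1;x)] for p(y;x) proportional to
    p1(y)^x * p2(y)^(1-x), where p1 and p2 are Bernoulli success probabilities."""
    weights = [
        (1 - p1)**x * (1 - p2)**(1 - x),  # y = 0
        p1**x * p2**(1 - x),              # y = 1
    ]
    z = sum(weights)                      # partition function Z(x)
    return [w / z for w in weights]

p1, p2 = sigmoid(1.0), 0.5
for x in (-2.0, 0.0, 0.3, 1.0, 2.5):
    pmf = geometric_mean_pmf(x, p1, p2)
    assert math.isclose(pmf[1], sigmoid(x))
    assert math.isclose(pmf[1] / pmf[0], math.exp(x))
```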
- Tilting
$$p(y;x)=\frac{p(y)\,e^{xy}}{Z(x)},\qquad \ln p(y;x)=xy-\ln Z(x)+\ln p(y)$$
  - Example: $p(y)\sim N(0,1)$ gives $p(y;x)\sim N(x,1)$
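The tilting example can be checked by numerical integration. This is a rough sketch that uses a plain Riemann sum on a fixed grid (the grid limits and size are arbitrary assumptions); it confirms that tilting $N(0,1)$ by $e^{xy}$ gives mean $x$, variance 1, and $Z(x)=e^{x^{2}/2}$.

```python
import math

def std_normal_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)

def tilted_moments(x, lo=-12.0, hi=12.0, n=4801):
    """Return (Z(x), mean, variance) of p(y) e^{xy} / Z(x) with p = N(0, 1),
    using a Riemann sum on a uniform grid."""
    h = (hi - lo) / (n - 1)
    ys = [lo + i * h for i in range(n)]
    w = [std_normal_pdf(y) * math.exp(x * y) for y in ys]  # unnormalized density
    z = h * sum(w)
    mean = h * sum(y * wi for y, wi in zip(ys, w)) / z
    var = h * sum((y - mean) ** 2 * wi for y, wi in zip(ys, w)) / z
    return z, mean, var

z, mean, var = tilted_moments(1.5)
assert math.isclose(z, math.exp(1.5**2 / 2), rel_tol=1e-6)
assert abs(mean - 1.5) < 1e-6
assert abs(var - 1.0) < 1e-6
```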
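- Linear exponential family
  - Definition: $\lambda(x)=x$, so $\ln p(y;x)=x\,t(y)-\alpha(x)+\beta(y)$
  - Properties: $\dot\alpha(x)=E[t(y)]$, $\ddot\alpha(x)=E[t^{2}(y)]-E[t(y)]^{2}=\mathrm{Var}(t(y))=J_y(x)$
Proof:
$$\begin{aligned}
Z(x) &= e^{\alpha(x)}=\int\exp\left(x\,t(y)+\beta(y)\right)dy\\
\dot\alpha(x) &= \frac{\dot Z(x)}{Z(x)}=\int t(y)\,p(y;x)\,dy=E[t(y)]\\
\ddot\alpha(x) &= \int t(y)\cdot p(y;x)\cdot\left(t(y)-\dot\alpha(x)\right)dy=E[t^{2}(y)]-E[t(y)]^{2}\\
J_y(x) &= E\left[-\frac{\partial^{2}}{\partial x^{2}}\ln p(y;x)\right]=\ddot\alpha(x)
\end{aligned}$$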
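The two derivative identities can be spot-checked with finite differences. The sketch below assumes the Bernoulli model in its natural parameterization, where $t(y)=y$, $\alpha(x)=\ln(1+e^{x})$ and $P(y=1)=\frac{1}{1+e^{-x}}$; this choice of model and the step size `h` are illustrative assumptions.

```python
import math

def alpha(x):
    """Log-partition function of the Bernoulli model with natural parameter x."""
    return math.log1p(math.exp(x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def alpha_dot(x, h=1e-3):
    """Central finite-difference estimate of the first derivative."""
    return (alpha(x + h) - alpha(x - h)) / (2 * h)

def alpha_ddot(x, h=1e-3):
    """Central finite-difference estimate of the second derivative."""
    return (alpha(x + h) - 2 * alpha(x) + alpha(x - h)) / (h * h)

x = 0.7
mean_t = sigmoid(x)              # E[t(y)] = P(y = 1)
var_t = mean_t * (1 - mean_t)    # Var(t(y)) for a Bernoulli variable
assert abs(alpha_dot(x) - mean_t) < 1e-6
assert abs(alpha_ddot(x) - var_t) < 1e-6
```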
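- Exponential families and efficient estimators (efficient statistics)
  - Necessary condition: if an efficient estimator exists, the model can be written in exponential-family form with $t(y)=\hat{x}_{eff}(y)$ and
$$\lambda(x)=\int^{x}J_y(u)\,du,\qquad \alpha(x)=\int^{x}u\,J_y(u)\,du$$
Proof:
$$\begin{aligned}
\hat{x}_{eff}(y) &= x+\frac{1}{J_y(x)}\frac{\partial}{\partial x}\ln p(y;x)\\
\frac{\partial}{\partial x}\ln p(y;x) &= J_y(x)\left(\hat{x}_{eff}(y)-x\right)\\
\ln p(y;x) &= \hat{x}_{eff}(y)\int^{x}J_y(u)\,du-\int^{x}u\,J_y(u)\,du+\beta(y)
\end{aligned}$$
  - Sufficient condition: for a linear exponential family, if $J_y(x)$ does not depend on $x$, that is, $J_y(x)$ equals a constant, then an efficient estimator exists.
Proof: let $J_y(x)=J$. Then $\ddot\alpha(x)=J$ and $\dot\alpha(x)=Jx-c$ for some constant $c$, so
$$\hat{x}_{eff}(y)=x+\frac{1}{J}\frac{\partial}{\partial x}\ln p(y;x)=x+\frac{1}{J}\left(t(y)-\dot\alpha(x)\right)=x+\frac{1}{J}\left(t(y)-Jx+c\right)=\frac{t(y)}{J}+\frac{c}{J},$$
which does not depend on $x$ and is therefore a valid estimator.
Since
$$\left.\frac{\partial}{\partial x}\ln p(y;x)\right|_{x=\hat{x}_{ML}}=0=t(y)-\left.\dot\alpha(x)\right|_{x=\hat{x}_{ML}},$$
we have
$$\hat{x}_{eff}(y)=\frac{c}{J}+\frac{1}{J}\left.\dot\alpha(x)\right|_{x=\hat{x}_{ML}}=\hat{x}_{ML}(y)$$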
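The constant-information case can be illustrated with $N$ i.i.d. samples from $N(x,1)$, a model assumed here for illustration: $t(y)=\sum_i y_i$, $\alpha(x)=Nx^{2}/2$, $J=N$ and $c=0$. The sketch checks that $x+\frac{1}{J}\frac{\partial}{\partial x}\ln p(y;x)$ gives the same value for every $x$ and that this value is the sample mean, which is also the ML estimate.

```python
import math

def score(ys, x):
    """d/dx ln p(y;x) = t(y) - alpha'(x) for i.i.d. N(x, 1) samples."""
    return sum(ys) - len(ys) * x

def x_eff(ys, x):
    """Candidate efficient estimator x + score / J with J = N."""
    fisher_information = len(ys)
    return x + score(ys, x) / fisher_information

ys = [0.3, -1.2, 2.0, 0.7]          # arbitrary illustrative data
sample_mean = sum(ys) / len(ys)     # ML estimate for this model

# The value does not depend on the x we plug in, so it is a valid estimator.
for x in (-3.0, 0.0, 0.45, 5.0):
    assert math.isclose(x_eff(ys, x), sample_mean)
```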
2. Sufficient statistics
2.1 Non-Bayesian case
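- Definition: $t(y)$ is a sufficient statistic for the model $p_y(\cdot;x)$ if $p(y\mid t(y);x)$ does not depend on $x$.
Theorem 1 (likelihood characterization):
$t(y)$ is sufficient w.r.t. $p(y;x)$ ⟺ $\frac{p_y(y;x)}{p_t(t(y);x)}$ does not depend on $x$, for all $x$ and $y$
Proof: omitted.
Theorem 2 (Neyman factorization theorem):
$t(y)$ is sufficient w.r.t. $p(y;x)$ ⟺ there exist $a(\cdot,\cdot)$ and $b(\cdot)$ such that $p(y;x)=a(t(y),x)\cdot b(y)$
Proof: omitted.
- Minimal sufficient statistic: $t^{*}$ is minimal if, for every other sufficient statistic $t$, there exists a function $g(\cdot)$ such that $t^{*}=g(t)$.
- Complete: $t^{*}$ is complete if, for any function $\phi(\cdot)$, $E[\phi(t^{*}(y))]=0\ \forall x$ ⟺ $\phi(\cdot)\equiv 0$.
Theorem: complete ⟹ minimal
Proof: suppose $t$ is complete and $s$ is minimal, so that $s=g(t)$ for some $g$. By iterated expectations, $E[t]=E\left[E[t\mid s]\right]$, where
$$E[t\mid s]=f(s)=f(g(t))=\tilde f(t),$$
and $f$ does not depend on $x$ because $s$ is sufficient.
Take $\phi(t)=t-\tilde f(t)$; then $E[\phi(t)]=0$ for all $x$.
By the definition of completeness, $\phi(t)\equiv 0\implies t=\tilde f(t)=f(s)$.
Hence $t$ is a function of the minimal statistic $s$, so $t$ is also minimal.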
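Both theorems can be illustrated with $N$ i.i.d. Bernoulli samples, an example assumed here rather than taken from the notes. With $t(y)=\sum_i y_i$, the likelihood factors as $a(t,x)=x^{t}(1-x)^{N-t}$ and $b(y)=1$, and the ratio $p_y(y;x)/p_t(t(y);x)$ equals $1/\binom{N}{t}$ for every $x$.

```python
import math

def seq_likelihood(ys, x):
    """p(y;x) for an i.i.d. Bernoulli sequence; depends on ys only via sum(ys)."""
    t = sum(ys)
    return x**t * (1 - x)**(len(ys) - t)

def stat_pmf(t, n, x):
    """p_t(t;x): binomial pmf of the statistic t = sum(ys)."""
    return math.comb(n, t) * x**t * (1 - x)**(n - t)

def ratio(ys, x):
    """p_y(y;x) / p_t(t(y);x), which Theorem 1 says must not depend on x."""
    return seq_likelihood(ys, x) / stat_pmf(sum(ys), len(ys), x)

ys_a, ys_b = [1, 0, 1, 0], [0, 0, 1, 1]   # same value of the statistic
for x in (0.1, 0.5, 0.9):
    assert math.isclose(seq_likelihood(ys_a, x), seq_likelihood(ys_b, x))
    assert math.isclose(ratio(ys_a, x), 1 / math.comb(4, 2))
```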
2.2 Bayesian case
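- Definition: $t(y)$ is a sufficient statistic for the joint distribution $p_{y,x}(\cdot,\cdot)$ if $p_{y\mid t,x}(y\mid t(y),x)=p_{y\mid t}(y\mid t(y))$, that is, it does not depend on $x$.
Theorem (belief characterization):
$t(y)$ is sufficient w.r.t. $p(y,x)$ ⟺ $p(x\mid y)=p(x\mid t(y))$, for all $x$ and $y$
Proof: omitted.
Theorem (Neyman factorization theorem):
$t(y)$ is sufficient w.r.t. $p(y,x)$ ⟺ $p(y\mid x)=p(t(y)\mid x)\cdot p(y\mid t(y))$, for all $x$ and $y$
Proof: omitted.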
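The belief characterization can be seen in a small discrete example. The sketch below assumes a three-point prior on the Bernoulli parameter (the prior values are made up for illustration) and compares the posterior computed from the full sequence $y$ with the posterior computed from $t(y)=\sum_i y_i$ alone.

```python
import math

# Hypothetical discrete prior over the Bernoulli parameter x.
prior = {0.2: 0.5, 0.5: 0.3, 0.8: 0.2}

def normalize(weights):
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}

def posterior_from_sequence(ys):
    """p(x | y) using the likelihood of the whole sequence."""
    t, n = sum(ys), len(ys)
    return normalize({x: p * x**t * (1 - x)**(n - t) for x, p in prior.items()})

def posterior_from_statistic(t, n):
    """p(x | t) using only the binomial pmf of t = sum(ys)."""
    return normalize(
        {x: p * math.comb(n, t) * x**t * (1 - x)**(n - t) for x, p in prior.items()}
    )

ys = [1, 0, 1, 1, 0]
post_y = posterior_from_sequence(ys)
post_t = posterior_from_statistic(sum(ys), len(ys))
assert all(math.isclose(post_y[x], post_t[x]) for x in prior)
```

The binomial coefficient cancels in the normalization, which is the factor $p(y\mid t(y))$ in the factorization theorem.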
3. Conjugate priors
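- Idea: given a model $p_{y\mid x}$, look for a family of priors $p_x$ such that the induced posterior $p_{x\mid y}$ is also in this family.
- Definition: a family of distributions $q(\cdot;\theta)$ is conjugate to a model $p_{y\mid x}$ if
- $p_{y\mid x}(y_1,\dots,y_N\mid x)\propto q(x;\theta)$ for some $\theta$, viewed as a function of $x$
- $q(x;\theta_1)\,q(x;\theta_2)\propto q(x;\theta_3)$ for some $\theta_3$, so the family is closed under products
- Theorem: if, for any sample size $N$, the joint distribution $p^{N}_{y\mid x}(\cdot)$ has a sufficient statistic whose dimension does not depend on $N$, then a conjugate prior exists for the model.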
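A standard instance of this definition is the Beta family for the Bernoulli model, used here as an assumed example. The likelihood $x^{t}(1-x)^{N-t}$ has the shape of a Beta kernel in $x$, and multiplying it by a $\mathrm{Beta}(a,b)$ prior gives a $\mathrm{Beta}(a+t,\,b+N-t)$ posterior. The sketch checks this proportionality at a few points.

```python
import math

def beta_kernel(x, a, b):
    """Unnormalized Beta(a, b) density q(x; theta) with theta = (a, b)."""
    return x**(a - 1) * (1 - x)**(b - 1)

def likelihood(ys, x):
    """p(y_1, ..., y_N | x) for i.i.d. Bernoulli samples."""
    t = sum(ys)
    return x**t * (1 - x)**(len(ys) - t)

def posterior_params(a, b, ys):
    """Conjugate update: only the sufficient statistic t = sum(ys) and N matter."""
    t = sum(ys)
    return a + t, b + len(ys) - t

a0, b0 = 2.0, 3.0
ys = [1, 1, 0, 1, 0, 1]
a1, b1 = posterior_params(a0, b0, ys)

# prior * likelihood is proportional to the updated Beta kernel:
# the ratio is the same constant at every x.
ratios = [
    beta_kernel(x, a0, b0) * likelihood(ys, x) / beta_kernel(x, a1, b1)
    for x in (0.1, 0.4, 0.9)
]
assert all(math.isclose(r, ratios[0]) for r in ratios)
```

The parameter vector stays two-dimensional no matter how many samples arrive, matching the fixed-dimension sufficient statistic in the theorem.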
Bonennult
Source: https://blog.csdn.net/weixin_41024483/article/details/104165233