EM アルゴリズムの数式メモ

Last updated at 2026-03-07Posted at 2026-03-01

EM Algorithm

EMアルゴリズム（Expectation-Maximization Algorithm）は、観測データに欠損がある場合や、潜在変数が存在する場合に、最尤推定量を求めるための反復アルゴリズムです。

観測データ: $X = {x_1, x_2, \dots, x_n}$
潜在変数（欠損データ）: $Z = {z_1, z_2, \dots, z_n}$
任意の確率分布 $\color{#42A5F5}{q(Z)}$ ただし、 $\sum_Z \color{#42A5F5}{q(Z)} = 1$
モデルパラメータ: $\theta$

目的は、観測データの対数尤度 $L(\theta) = \log \color{#EF5350}{p(X|\theta)}$ を
最大化する $\theta$ を見つけることです。
しかし、直接 $\log \color{#EF5350}{p(X|\theta)} = \log \sum_Z \color{#FFCA28}{p(X, Z|\theta)}$ を計算して微分することは、
対数の中に和があるため困難です。
このときEMアルゴリズムを使います。

\begin{aligned}
L(\theta) &:= \log \color{#EF5350}{p(X|\theta)}  & \text{(1) 定義} \\
&= \log \sum_Z \color{#FFCA28}{p(X, Z|\theta)} & \text{(2) 潜在変数 } Z \text{ による周辺化} \\
&= \log \sum_Z \color{#42A5F5}{q(Z)} \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#42A5F5}{q(Z)}} & \text{(3) } \color{#42A5F5}{q(Z)}/\color{#42A5F5}{q(Z)} \text{ を掛ける（重み付き平均の形へ）} \\
&\ge \sum_Z \color{#42A5F5}{q(Z)} \log \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#42A5F5}{q(Z)}} & \text{(4) イェンセンの不等式（ } \log \text{ の下に凸性による）} \\
&= \sum_Z \color{#42A5F5}{q(Z)} \log \color{#FFCA28}{p(X, Z|\theta)} - \sum_Z \color{#42A5F5}{q(Z)} \log \color{#42A5F5}{q(Z)} & \text{(5) 対数の性質による分解} \\
&:= \mathcal{L}(q, \theta) & \text{(6) 下界（ELBO）の定義}
\end{aligned}

E step

準備

E step 準備のための別解釈

\begin{aligned}
L(\theta) &:= \log \color{#EF5350}{p(X|\theta)}  & \text{(1) 対数尤度の定義} \\
&= \sum_Z \color{#42A5F5}{q(Z)} \log \color{#EF5350}{p(X|\theta)} & \text{(2) } \sum_Z \color{#42A5F5}{q(Z)} = 1 \text{ による変形} \\
&= \sum_Z \color{#42A5F5}{q(Z)} \log \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#66BB6A}{p(Z|X, \theta)}} & \text{(3) 乗法定理 } \color{#EF5350}{p(X|\theta)} = \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#66BB6A}{p(Z|X, \theta)}} \text{ の代入} \\
&= \sum_Z \color{#42A5F5}{q(Z)} \log \left( \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#42A5F5}{q(Z)}} \cdot \frac{\color{#42A5F5}{q(Z)}}{\color{#66BB6A}{p(Z|X, \theta)}} \right) & \text{(4) } \color{#42A5F5}{q(Z)}/\color{#42A5F5}{q(Z)} \text{ を掛ける} \\
&= \sum_Z \color{#42A5F5}{q(Z)} \log \frac{\color{#FFCA28}{p(X, Z|\theta)}}{\color{#42A5F5}{q(Z)}} + \sum_Z \color{#42A5F5}{q(Z)} \log \frac{\color{#42A5F5}{q(Z)}}{\color{#66BB6A}{p(Z|X, \theta)}} & \text{(5) 対数の性質による和への分解} \\
&= \mathcal{L}(q, \theta) + KL(\color{#42A5F5}{q(Z)} || \color{#66BB6A}{p(Z|X, \theta)}) & \text{(6) ELBOとKLダイバージェンスによる定義} \\
\end{aligned}

最大化

$\theta$ を固定して $q$ で $\mathcal{L}(q, \theta_t)$ を最大化

\begin{aligned}
q_{t+1} &= \arg \max_{q} \mathcal{L}(q, \theta_t) & \text{(1) 下界の最大化} \\
&= \arg \max_{q} \left[ \log \color{#EF5350}{p(X|\theta_t)} - KL(\color{#42A5F5}{q(Z)} \parallel \color{#66BB6A}{p(Z|X, \theta_t)}) \right] & \text{(2) 対数尤度とKL項への分解} \\
&= \arg \min_{q} KL(\color{#42A5F5}{q(Z)} \parallel \color{#66BB6A}{p(Z|X, \theta_t)}) & \text{(3) } q \text{ に依存しない定数項の除外} \\
&= p & \text{(4) KLダイバージェンスは分布一致で最小} \\
q_{t+1}(Z) &= \color{#66BB6A}{p(Z|X, \theta_t)} \\
\end{aligned}

M step

$q$ を固定して $\theta$ で $\mathcal{L}(q, \theta_t)$ を最大化

\begin{aligned}
\theta_{t+1} &= \arg \max_{\theta} \mathcal{L}(q_{t+1}, \theta) & \text{(1) 下界の最大化} \\
&= \arg \max_{\theta} \left[ \sum_Z \color{#42A5F5}{q_{t+1}(Z)} \log \color{#FFCA28}{p(X, Z|\theta)} - \sum_Z \color{#42A5F5}{q_{t+1}(Z)} \log \color{#42A5F5}{q_{t+1}(Z)} \right] & \text{(2) 下界の定義による展開} \\
&= \arg \max_{\theta} \left[ \sum_Z \color{#66BB6A}{p(Z|X, \theta_t)} \log \color{#FFCA28}{p(X, Z|\theta)} - \sum_Z \color{#66BB6A}{p(Z|X, \theta_t)} \log \color{#66BB6A}{p(Z|X, \theta_t)} \right] & \text{(3) } \color{#42A5F5}{q_{t+1}(Z)} = \color{#66BB6A}{p(Z|X, \theta_t)} \text{ を代入} \\
&= \arg \max_{\theta} \sum_Z \color{#66BB6A}{p(Z|X, \theta_t)} \log \color{FFFF00}{\color{#FFCA28}{p(X, Z|\theta)}} & \text{(4) 第2項は } \theta \text{ に依存しない定数であるため除外} \\
&:= \arg \max_{\theta} Q(\theta |\theta_t) & \text{(5) } Q \text{ 関数の定義} \\
\end{aligned}

$Q(\theta |\theta_t)$ は完全データの対数尤度の期待値。
具体的な分布が定まってから計算。

まとめ

$L(\theta) \ge \mathcal{L}(q, \theta)$
$q_{t+1} = \arg \max_{q} \mathcal{L}(q, \theta_t) = \color{#66BB6A}{p(Z|X, \theta_t)} $
$ \theta_{t+1} = \arg \max_{\theta} \mathcal{L}(q_{t+1}, \theta) = \arg \max_{\theta} Q(\theta |\theta_t) $

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up

EM アルゴリズム の数式メモ

EM Algorithm

E step

準備

最大化

M step

まとめ

EM アルゴリズムの数式メモ