
The Idea of Bayesian Inference (1): Prior and Posterior Probability

Last updated at 2020-08-13, posted at 2020-08-09

Pythonで体験するベイズ推論

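The idea of Bayesian estimation: an example, librarian or farmer?

It's summer vacation, so I am working through the book "Pythonで体験するベイズ推論" a little at a time. I plan to write up what I learn in brief notes, along with some Python.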

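Problem
Steve is kind but introverted. He has little interest in other people and likes doing things in order. Is Steve a librarian or a farmer?

The ratio of male farmers to male librarians is 20:1 (this ratio determines the prior probability).

Steve is a librarian: A
Steve is a farmer: B (written $\sim A$, "not A", in the formulas below)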

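The prior probability that Steve is a librarian, $P(A)$, is

\displaystyle P(A)=1/21\approx 0.0476

because 1 out of every 21 men in these two occupations is a librarian.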
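As a quick check of the arithmetic, the prior can be derived directly from the 20:1 ratio. This is only a minimal sketch; the variable names `farmers`, `librarians`, and `p_a` are my own and do not come from the book.

```python
from fractions import Fraction

# Ratio of male farmers to male librarians, taken from the problem statement.
farmers, librarians = 20, 1

# Prior probability that Steve is a librarian: P(A) = 1 / (20 + 1)
p_a = Fraction(librarians, farmers + librarians)

print(p_a)         # 1/21
print(float(p_a))  # about 0.0476
```

Using `Fraction` keeps the value exact, so the rounding only happens when it is printed as a float.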
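Let $X$ denote the information we are given about Steve's personality (that he is introverted).
What we want is $P(A | X)$, the posterior probability that Steve is a librarian given the information $X$.
We obtain it from Bayes' theorem.
Bayes' theorem: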


P( A | X ) = \displaystyle \frac{ P(X | A) P(A) } {P(X) }

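Assume that the probability that a neighbor describes Steve as introverted, given that he is a librarian, is $P(X|A) = 0.95$.
Next consider $P(X)$, the probability that a neighbor describes someone as introverted regardless of occupation. We decompose it as follows.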

\begin{align}
P(X ) &= P(X \text{ and } A) + P(X \text{ and } \sim A)\\
&= P(X|A)P(A) + P(X | \sim A)P(\sim A)\\
&= P(X|A)p + P(X | \sim A)(1-p)
\end{align}

Here $p = P(A)$, and $P(\sim A)$ means $P(\text{not } A)$.

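The first line means the following.
It adds the probability that Steve is introverted and a librarian, $P(X \text{ and } A)$, to the probability that he is introverted and not a librarian (that is, a farmer), $P(X \text{ and } \sim A)$. In other words, it is the total probability that Steve is described as introverted, whatever his occupation (here, librarian or farmer).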

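So far we know $P(X| A)=0.95$ and $P(A)=1/21\approx 0.0476$. We also have $P(\sim A)=1-P(A)=20/21$.
The remaining unknown is $P(X | \sim A)$, the probability that a neighbor describes him as introverted if he is a farmer. This value has to be assumed, so we set it to $0.5$ for now. With that, $P(X)$ can be computed.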

\begin{align}
P(X ) &= P(X \text{ and } A) + P(X \text{ and } \sim A)\\
&= 0.95 \times \frac{1}{21} + 0.5 \times \frac{20}{21}\\
&\approx 0.52
\end{align}
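The same decomposition can be written as a small Python function. This is a sketch under the values used in this article; the function name `total_probability` and its parameter names are my own choices, not something from the book.

```python
def total_probability(p_x_given_a, p_a, p_x_given_not_a):
    """Return P(X) = P(X|A)P(A) + P(X|~A)P(~A), with P(~A) = 1 - P(A)."""
    return p_x_given_a * p_a + p_x_given_not_a * (1 - p_a)


# Values from the article: P(X|A) = 0.95, P(A) = 1/21, assumed P(X|~A) = 0.5
p_x = total_probability(0.95, 1 / 21, 0.5)
print(round(p_x, 4))  # 0.5214
```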

Now that we have $P(X)$, Bayes' theorem gives $P(A | X)$.

\begin{align}
P( A | X ) &= \displaystyle \frac{ P(X | A) P(A) } {P(X) }\\
&= \displaystyle \frac{0.95 \times \frac{1}{21}}{0.52}\\
&\approx 0.087
\end{align}
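The whole calculation fits in a few lines of Python. This is a minimal sketch using the numbers from this article; the function name `posterior` and its parameters are hypothetical names I picked for illustration.

```python
def posterior(p_x_given_a, p_a, p_x_given_not_a):
    """Return P(A|X) = P(X|A)P(A) / P(X) via Bayes' theorem."""
    # Denominator: P(X) by the law of total probability
    p_x = p_x_given_a * p_a + p_x_given_not_a * (1 - p_a)
    return p_x_given_a * p_a / p_x


# P(X|A) = 0.95, P(A) = 1/21, assumed P(X|~A) = 0.5
p_a_given_x = posterior(0.95, 1 / 21, 0.5)
print(round(p_a_given_x, 4))  # 0.0868
```

Without rounding $P(X)$ to 0.52 first, the result is about 0.0868, which agrees with the hand calculation to the precision shown there.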

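In other words, without any information about his personality, the probability that Steve is a librarian was $P(A)\approx 0.0476$. After learning that he is introverted, the probability becomes $P(A | X )\approx 0.087$, which is nearly double.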
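Since $P(X | \sim A)=0.5$ was simply an assumed value, it may be worth seeing how much the posterior depends on it. The sketch below recomputes $P(A | X)$ for a few alternative values; the candidate list and the name `posterior_librarian` are my own assumptions for illustration and are not from the book.

```python
def posterior_librarian(p_x_given_not_a, p_x_given_a=0.95, p_a=1 / 21):
    """Return P(A|X) by Bayes' theorem for a given P(X|~A)."""
    p_x = p_x_given_a * p_a + p_x_given_not_a * (1 - p_a)
    return p_x_given_a * p_a / p_x


# Hypothetical alternatives to the assumed value 0.5
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
results = {q: posterior_librarian(q) for q in candidates}

for q, post in results.items():
    print(f"P(X|~A) = {q:.1f} -> P(A|X) = {post:.3f}")
```

The smaller the chance that a farmer is described as introverted, the more the evidence favors "librarian". For every value tried here the posterior stays above the prior of about 0.0476, because each candidate is smaller than $P(X|A)=0.95$.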

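The relationship between prior and posterior probability is fundamental, so I want to make sure I remember it.
This example comes from the book, and I found it the easiest to understand of everything I have read so far.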

Now for the Python script.
The scripts are available as Jupyter notebooks in the following GitHub repository.
https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

On closer inspection, though, the Python script for this first example only draws a graph.


# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt

# Bar colours: blue for the prior, red for the posterior
colours = ["#348ABD", "#A60628"]

# Note: these values and labels describe a "bugs absent / bugs present"
# example, not Steve's occupation.
prior = [0.20, 0.80]
posterior = [1./3, 2./3]

# Prior bars
plt.bar([0, .7], prior, alpha=0.70, width=0.25,
        color=colours[0], label="prior distribution",
        lw=3, edgecolor=colours[0])

# Posterior bars, shifted right by one bar width
plt.bar([0+0.25, .7+0.25], posterior, alpha=0.7,
        width=0.25, color=colours[1],
        label="posterior distribution",
        lw=3, edgecolor=colours[1])

# One tick label per prior/posterior pair
plt.xticks([0.20, .95], ["Bugs Absent", "Bugs Present"])
plt.title("Prior and Posterior probability of bugs present")
plt.ylabel("Probability")
plt.legend(loc="upper left")
plt.show()

Next article:
"The Idea of Bayesian Inference (2): Bayesian Estimation and Probability Distributions"
