Cheat Sheet of 18 Well-Known Probability Distributions
Distribution | Mean | Variance | Type | Quick note |
---|---|---|---|---|
Normal | $$\mu$$ | $$\sigma^2$$ | Continuous | The widely used "bell curve" |
Binomial | $$np$$ | $$np(1 - p)$$ | Discrete | Number of successes in $$n$$ trials |
Poisson | $$\lambda$$ | $$\lambda$$ | Discrete | Count of rare events |
Exponential | $$1/\lambda$$ | $$1/\lambda^2$$ | Continuous | Waiting time between events |
Uniform | $$(a + b) / 2$$ | $$(b - a)^2 / 12$$ | Continuous | Every value in $$[a, b]$$ is equally likely |
Gamma | $$\alpha \beta$$ | $$\alpha \beta^2$$ | Continuous | Waiting time until the $$\alpha$$-th event ($$\beta$$ = scale) |
Beta | $$\alpha / (\alpha + \beta)$$ | $$\alpha \beta / ((\alpha + \beta)^2 (\alpha + \beta + 1))$$ | Continuous | A distribution over probabilities |
Geometric | $$1/p$$ | $$(1-p)/p^2$$ | Discrete | Number of trials until the first success |
Negative Binomial | $$r/p$$ | $$r(1-p)/p^2$$ | Discrete | Number of trials until the $$r$$-th success |
Hypergeometric | $$N_1n/N$$ | $$N_1N_2n(N-n)/(N^2(N-1))$$ | Discrete | Sampling without replacement ($$n$$ draws, $$N = N_1 + N_2$$) |
Bernoulli | $$p$$ | $$p(1-p)$$ | Discrete | Success or failure |
Log-Normal | $$e^{\mu+\sigma^2/2}$$ | $$(e^{\sigma^2}-1)e^{2\mu+\sigma^2}$$ | Continuous | Positive values whose logarithm is normally distributed |
Weibull | $$\beta \Gamma(1+1/\alpha)$$ | $$\beta^2[\Gamma(1+2/\alpha) - (\Gamma(1+1/\alpha))^2]$$ | Continuous | Durability and lifetimes ($$\alpha$$ = shape, $$\beta$$ = scale) |
Cauchy | Undefined | Undefined | Continuous | No mean or variance |
Pareto | $$\alpha x_m / (\alpha - 1)$$ for $$\alpha > 1$$ | $$\alpha x_m^2 / ((\alpha - 1)^2 (\alpha - 2))$$ for $$\alpha > 2$$ | Continuous | The "80-20" rule |
Chi-Squared | $$k$$ | $$2k$$ | Continuous | $$k$$ degrees of freedom |
Student's t | $$0$$ if $$\nu > 1$$ | $$\nu/(\nu - 2)$$ if $$\nu > 2$$ | Continuous | Stand-in for the normal distribution when samples are small |
F-distribution | $$\nu_2 / (\nu_2 - 2)$$ for $$\nu_2 > 2$$ | $$2\nu_2^2(\nu_1 + \nu_2 - 2) / (\nu_1 (\nu_2 - 2)^2 (\nu_2 - 4))$$ for $$\nu_2 > 4$$ | Continuous | Tests on ratios of variances |
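
The closed-form means and variances in the table can be spot-checked against `scipy.stats`. The sketch below does this for a subset of the rows. All parameter values are arbitrary illustrations and not part of the original cheat sheet, and the mapping from the table's symbols to SciPy's arguments (for example `scale=1/lam` for the exponential, or `loc=r` to turn SciPy's failure count into a trial count for the negative binomial) is spelled out in the comments.

```python
import math
from scipy import stats

# Illustrative parameter values (assumptions, chosen only for this check).
mu, sigma = 1.0, 2.0            # Normal
n, p = 10, 0.3                  # Binomial, Geometric, Negative Binomial
lam = 3.0                       # Poisson, Exponential (rate)
a, b = 0.0, 10.0                # Uniform on [a, b]
alpha, beta = 2.0, 1.5          # Gamma/Weibull (shape, scale) and Beta (alpha, beta)
r = 4                           # Negative Binomial: required successes
N1, N2, draws = 7, 13, 12       # Hypergeometric: successes, failures, draws
N = N1 + N2
mu_ln, sigma_ln = 0.65, 0.5     # Log-Normal: mean and sd of log(X)
pareto_alpha, x_m = 3.0, 1.0    # Pareto: shape and minimum value
k = 5                           # Chi-Squared degrees of freedom

g1 = math.gamma(1 + 1 / alpha)  # Gamma function values for the Weibull row
g2 = math.gamma(1 + 2 / alpha)

# name -> (frozen SciPy distribution, mean from the table, variance from the table)
cases = {
    "Normal": (stats.norm(loc=mu, scale=sigma), mu, sigma**2),
    "Binomial": (stats.binom(n=n, p=p), n * p, n * p * (1 - p)),
    "Poisson": (stats.poisson(mu=lam), lam, lam),
    # SciPy uses scale = 1 / lambda.
    "Exponential": (stats.expon(scale=1 / lam), 1 / lam, 1 / lam**2),
    # SciPy's uniform covers [loc, loc + scale].
    "Uniform": (stats.uniform(loc=a, scale=b - a), (a + b) / 2, (b - a) ** 2 / 12),
    "Gamma": (stats.gamma(a=alpha, scale=beta), alpha * beta, alpha * beta**2),
    "Beta": (stats.beta(a=alpha, b=beta), alpha / (alpha + beta),
             alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))),
    "Geometric": (stats.geom(p=p), 1 / p, (1 - p) / p**2),
    # SciPy's nbinom counts failures; loc=r shifts it to the number of trials.
    "Negative Binomial": (stats.nbinom(n=r, p=p, loc=r), r / p, r * (1 - p) / p**2),
    # SciPy: M = population size, n = success states, N = number of draws.
    "Hypergeometric": (stats.hypergeom(M=N, n=N1, N=draws), N1 * draws / N,
                       N1 * N2 * draws * (N - draws) / (N**2 * (N - 1))),
    "Bernoulli": (stats.bernoulli(p=p), p, p * (1 - p)),
    "Log-Normal": (stats.lognorm(s=sigma_ln, scale=math.exp(mu_ln)),
                   math.exp(mu_ln + sigma_ln**2 / 2),
                   (math.exp(sigma_ln**2) - 1) * math.exp(2 * mu_ln + sigma_ln**2)),
    "Weibull": (stats.weibull_min(c=alpha, scale=beta), beta * g1,
                beta**2 * (g2 - g1**2)),
    "Pareto": (stats.pareto(b=pareto_alpha, scale=x_m),
               pareto_alpha * x_m / (pareto_alpha - 1),
               pareto_alpha * x_m**2 / ((pareto_alpha - 1) ** 2 * (pareto_alpha - 2))),
    "Chi-Squared": (stats.chi2(df=k), k, 2 * k),
}


def compare(cases):
    """Return name -> (scipy_mean, table_mean, scipy_var, table_var)."""
    return {
        name: (float(dist.mean()), table_mean, float(dist.var()), table_var)
        for name, (dist, table_mean, table_var) in cases.items()
    }


results = compare(cases)
for name, (m_scipy, m_table, v_scipy, v_table) in results.items():
    ok = (math.isclose(m_scipy, m_table, rel_tol=1e-9)
          and math.isclose(v_scipy, v_table, rel_tol=1e-9))
    print(f"{name:18s} mean {m_scipy:.6g} / {m_table:.6g}   "
          f"var {v_scipy:.6g} / {v_table:.6g}   {'OK' if ok else 'MISMATCH'}")
```

A `MISMATCH` usually points to a parameterization difference (rate versus scale, trials versus failures) rather than a wrong formula, so the comments on each entry are the first place to look.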
Visualization with Python
# Import necessary libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
# Create subplots
fig, axs = plt.subplots(6, 3, figsize=(18, 24))
# List of distributions to plot
distributions = [
    ('Normal', stats.norm(loc=0, scale=1)),
    ('Binomial', stats.binom(n=10, p=0.5)),
    ('Poisson', stats.poisson(mu=3)),
    ('Exponential', stats.expon(scale=1)),
    ('Uniform', stats.uniform(loc=0, scale=10)),
    ('Gamma', stats.gamma(a=2, scale=1)),
    ('Beta', stats.beta(a=2, b=5, loc=0, scale=1)),
    ('Geometric', stats.geom(p=0.5)),
    # Note: scipy's nbinom counts failures before the n-th success, not trials
    ('Negative Binomial', stats.nbinom(n=10, p=0.5)),
    ('Hypergeometric', stats.hypergeom(M=20, n=7, N=12)),
    ('Bernoulli', stats.bernoulli(p=0.6)),
    ('Log-Normal', stats.lognorm(s=0.954, loc=0, scale=np.exp(0.65))),
    ('Weibull', stats.weibull_min(c=1.79, loc=0, scale=1)),
    ('Cauchy', stats.cauchy(loc=0, scale=1)),
    ('Pareto', stats.pareto(b=2.62)),
    ('Chi-Squared', stats.chi2(df=5)),
    ("Student's t", stats.t(df=10)),
    ('F-distribution', stats.f(dfn=29, dfd=18)),
]
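# Plot each distribution
for ax, (name, dist) in zip(axs.flatten(), distributions):
    if name == 'Bernoulli':
        # Bernoulli takes only the values 0 and 1
        x = np.array([0, 1])
        ax.bar(x, dist.pmf(x), alpha=0.7, label='pmf')
    elif name in ['Binomial', 'Poisson', 'Geometric', 'Negative Binomial', 'Hypergeometric']:
        # Discrete: cover the 0.1%-99.9% quantile range so no panel is cut off
        x = np.arange(int(dist.ppf(0.001)), int(dist.ppf(0.999)) + 1)
        ax.bar(x, dist.pmf(x), alpha=0.7, label='pmf')
    else:
        # Continuous: plot the density between the 1st and 99th percentiles
        x = np.linspace(dist.ppf(0.01), dist.ppf(0.99), 100)
        ax.plot(x, dist.pdf(x), label='pdf')
    ax.set_title(name)
    ax.legend()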
# Adjust layout to prevent overlap
plt.tight_layout()
# Show the plot
plt.show()
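
The Cauchy panel looks like a slightly wider bell curve, yet the table lists its mean and variance as undefined. One way to see what that means in practice is to compare running sample means: for the normal distribution they settle near 0, while for the Cauchy they keep jumping however many samples are drawn. The following is a minimal sketch of that comparison; the seed, sample size, checkpoints, and the helper `running_mean` are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability
n_samples = 100_000

normal_samples = stats.norm(loc=0, scale=1).rvs(size=n_samples, random_state=rng)
cauchy_samples = stats.cauchy(loc=0, scale=1).rvs(size=n_samples, random_state=rng)


def running_mean(samples):
    """Mean of the first k samples, for every k from 1 to len(samples)."""
    return np.cumsum(samples) / np.arange(1, len(samples) + 1)


normal_running = running_mean(normal_samples)
cauchy_running = running_mean(cauchy_samples)

# Print the running means at a few checkpoints (hypothetical choice of k values)
for k in (100, 1_000, 10_000, 100_000):
    print(f"k={k:>7d}  normal: {normal_running[k - 1]: .4f}  "
          f"cauchy: {cauchy_running[k - 1]: .4f}")

# The median is still well defined for the Cauchy and sits at loc
print("Cauchy sample median:", np.median(cauchy_samples))

# scipy does not report a finite mean or variance for the Cauchy
print("scipy mean/var:", stats.cauchy(loc=0, scale=1).mean(),
      stats.cauchy(loc=0, scale=1).var())
```

Because the mean is undefined, summaries such as the median or interquartile range are the safer way to describe Cauchy-like data.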