
[For Beginners] Frequency Table / Histogram and Mode

Posted at 2020-09-26

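Regarding the concept of the mode in statistics, the Japanese Industrial Standards apparently define it as "the value of the random variable at which the probability function (for a discrete distribution) or the density function (for a continuous distribution) attains its maximum; if the distribution is multimodal, the values of the random variable that give each local maximum."
最頻値 (Mode) - Wikipedia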
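
To make the two halves of that definition concrete, here is a minimal sketch in R. The samples `x` and `y` are made up for illustration, and using the peak of a kernel density estimate for the continuous case is only one possible choice, not something the standard prescribes.

```r
# Discrete case: the mode is the value with the largest count.
x <- c(2, 3, 3, 5, 5, 5, 7)            # made-up sample
counts <- table(x)
discrete_mode <- as.numeric(names(counts)[counts == max(counts)])
print(discrete_mode)                   # 5

# Continuous case: use the location where an estimated density peaks.
set.seed(1)                            # assumed seed, for repeatable output
y <- rnorm(1000, mean = 10, sd = 2)    # made-up sample
d <- density(y)                        # kernel density estimate
continuous_mode <- d$x[which.max(d$y)]
print(continuous_mode)
```

Because the comparison `counts == max(counts)` keeps every tied value, the discrete part returns all modes when the sample is multimodal.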

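The mode is often described as "the value that appears most frequently in a data set or probability distribution", but in practice it also has a very human side: the person processing the data "goes out to grab the big lump". Indeed, that is exactly what happens when you build a histogram, the visible form of the frequency table concept.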
image.png

Example implementation in the statistical language R

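K <- 100   # number of repetitions
N <- 1000  # number of random points per repetition
pi.est <- c()  # collects one result per repetition
for (k in seq_len(K)) {
  # N points drawn uniformly from the square [-1, 1] x [-1, 1]
  x <- runif(N, min = -1, max = 1)
  y <- runif(N, min = -1, max = 1)
  # Mean distance of the points from the origin
  # (despite the variable name, this is not an estimate of pi)
  Data <- sum(sqrt(x * x + y * y)) / N
  pi.est <- c(Data, pi.est)
}
hist(pi.est, breaks = 50)  # histogram with fine classes
rug(pi.est)                # mark each individual value on the x axis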
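
As a cross-check on what the loop above produces, the following sketch repeats the same simulation with `replicate()` and a fixed seed. The seed value, the name `mean_dist`, and the comparison with the closed-form mean distance from the centre of the square are additions for illustration and do not appear in the original code.

```r
set.seed(42)  # assumed seed, only so that reruns give the same numbers
K <- 100      # number of repetitions
N <- 1000     # number of random points per repetition

# One mean distance from the origin per repetition
mean_dist <- replicate(K, {
  x <- runif(N, min = -1, max = 1)
  y <- runif(N, min = -1, max = 1)
  mean(sqrt(x^2 + y^2))
})

# Closed-form mean distance from the centre of the square [-1, 1] x [-1, 1]
theory <- (sqrt(2) + log(1 + sqrt(2))) / 3   # about 0.7652

print(c(simulated = mean(mean_dist), theory = theory))
```

The simulated values should scatter around 0.765, which is where the bulk of the histogram sits.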

In practice, the raw data you start from looks something like this.

# Save the data for reproducibility
TestData01<-c(0.7575710,0.7626254,0.7720857,0.7636143,0.7559374,0.7636892 ,0.7697000,0.7744773,0.7718450,0.7661308,0.7643701,0.7773141,0.7601370,0.7625076,0.7657443,0.7642925,0.7652136,0.7626871,0.7651091,0.7838558,0.7577962,0.7679088,0.7691799,0.7778908,0.7651222,0.7745353,0.7630110,0.7704386,0.7693323,0.7626189,0.7589299,0.7469987,0.7606626,0.7728721,0.7608035,0.7766716,0.7748260,0.7754264,0.7671104,0.7588898,0.7763099,0.7590662,0.7713618,0.7706218,0.7752811,0.7783880,0.7727876,0.7694160,0.7689929,0.7661827,0.7569213,0.7565509,0.7594678,0.7557077,0.7661841,0.7831111,0.7671762,0.7670367,0.7456229,0.7633547,0.7806836,0.7537715,0.7705932,0.7695142,0.7594232,0.7629710,0.7571859,0.7698099,0.7585800,0.7565154,0.7619804,0.7723939,0.7621805,0.7769550,0.7552302,0.7477231,0.7605412,0.7620618,0.7601785,0.7479306,0.7770422,0.7799119,0.7709682,0.7657947,0.7612600,0.7643751,0.7653982,0.7593435,0.7823729,0.7659631,0.7529565,0.7603398,0.7592395,0.7607167,0.7566061,0.7573243,0.7640064,0.7661753,0.7667829,0.7758546)

# Redisplay the histogram and rug plot
hist(TestData01, breaks=50)
rug(TestData01)

image.png

From here on, it is simply a matter of "shaping" this data, over and over.
Building a frequency table in R

h<-hist(TestData01)
h
$breaks
[1] 0.745 0.750 0.755 0.760 0.765 0.770 0.775 0.780 0.785

$counts
[1] 4 2 19 24 23 13 11 4

$density
[1] 8 4 38 48 46 26 22 8

$mids
[1] 0.7475 0.7525 0.7575 0.7625 0.7675 0.7725 0.7775 0.7825

$xname
[1] "TestData01"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

# breaks holds the values that separate the classes, and counts holds the frequencies.
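
Given `breaks` and `counts`, the modal class can be read off directly. The sketch below retypes the two vectors from the output above so that it runs on its own; the names `modal_class`, `modal_mid`, and `dens` are introduced here for illustration.

```r
# Values copied from the hist() output above
breaks <- c(0.745, 0.750, 0.755, 0.760, 0.765, 0.770, 0.775, 0.780, 0.785)
counts <- c(4, 2, 19, 24, 23, 13, 11, 4)

# Class midpoints, computed the same way as $mids
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2

# The modal class is the class with the largest frequency
i <- which.max(counts)
modal_class <- c(lower = breaks[i], upper = breaks[i + 1])
modal_mid <- mids[i]
print(modal_class)   # 0.760 to 0.765
print(modal_mid)     # 0.7625

# $density is the frequency divided by (total count * class width)
dens <- counts / (sum(counts) * diff(breaks))
print(round(dens))   # 8 4 38 48 46 26 22 8
```

Note that `which.max()` returns only the first maximum, so a tie between classes would need a comparison such as `counts == max(counts)` instead.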

image.png


h <- hist(TestData01, breaks = 50)
n <- length(h$counts)        # number of classes
class_names <- character(n)  # storage for the class names
for (i in seq_len(n)) {
  # Label each class as "lower boundary ~ upper boundary"
  class_names[i] <- paste(h$breaks[i], "~", h$breaks[i + 1])
}
frequency_table <- data.frame(class = class_names, frequency = h$counts)

library(xtable)
print(xtable(frequency_table), type = "html")
  class frequency
1 0.745 ~ 0.746 1
2 0.746 ~ 0.747 1
3 0.747 ~ 0.748 2
4 0.748 ~ 0.749 0
5 0.749 ~ 0.75 0
6 0.75 ~ 0.751 0
7 0.751 ~ 0.752 0
8 0.752 ~ 0.753 1
9 0.753 ~ 0.754 1
10 0.754 ~ 0.755 0
11 0.755 ~ 0.756 3
12 0.756 ~ 0.757 4
13 0.757 ~ 0.758 4
14 0.758 ~ 0.759 3
15 0.759 ~ 0.76 5
16 0.76 ~ 0.761 7
17 0.761 ~ 0.762 2
18 0.762 ~ 0.763 7
19 0.763 ~ 0.764 4
20 0.764 ~ 0.765 4
21 0.765 ~ 0.766 7
22 0.766 ~ 0.767 5
23 0.767 ~ 0.768 4
24 0.768 ~ 0.769 1
25 0.769 ~ 0.77 6
26 0.77 ~ 0.771 4
27 0.771 ~ 0.772 2
28 0.772 ~ 0.773 4
29 0.773 ~ 0.774 0
30 0.774 ~ 0.775 3
31 0.775 ~ 0.776 3
32 0.776 ~ 0.777 3
33 0.777 ~ 0.778 3
34 0.778 ~ 0.779 1
35 0.779 ~ 0.78 1
36 0.78 ~ 0.781 1
37 0.781 ~ 0.782 0
38 0.782 ~ 0.783 1
39 0.783 ~ 0.784 2
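
With classes this narrow, the "most frequent class" is no longer unique. The sketch below retypes the frequency column of the table above (the vector name `freq` and the reconstruction of the lower boundaries from the 0.001 class width are assumptions made for illustration) and lists every class that ties for the maximum.

```r
# Frequencies copied from the table above (39 classes of width 0.001)
freq <- c(1, 1, 2, 0, 0, 0, 0, 1, 1, 0,
          3, 4, 4, 3, 5, 7, 2, 7, 4, 4,
          7, 5, 4, 1, 6, 4, 2, 4, 0, 3,
          3, 3, 3, 1, 1, 1, 0, 1, 2)
lower <- 0.745 + 0.001 * (seq_along(freq) - 1)  # lower boundary of each class

# All classes that share the largest frequency
modal_rows <- which(freq == max(freq))
print(modal_rows)         # 16 18 21
print(lower[modal_rows])  # 0.760 0.762 0.765
```

Three separate classes tie at a frequency of 7 here, whereas the coarser 0.005-wide classes give a single modal class (0.760 to 0.765), so the reported mode depends on how the classes are cut.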

Well, how should I put it... a "world of traditional craftsmanship"?
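
For comparison with the hand-written loop used earlier, a frequency table can also be built with `cut()` and `table()`. This is a minimal sketch rather than the article's method; to keep it short and self-contained it uses only the first ten values of `TestData01`, and it lets `hist()` pick the class boundaries.

```r
# First ten values of TestData01, copied here so the sketch runs on its own
x <- c(0.7575710, 0.7626254, 0.7720857, 0.7636143, 0.7559374,
       0.7636892, 0.7697000, 0.7744773, 0.7718450, 0.7661308)

# Let hist() choose the class boundaries, without drawing anything
h <- hist(x, plot = FALSE)

# cut() assigns each value to a class; these options match hist()'s defaults
# (right-closed intervals, lowest boundary included)
classes <- cut(x, breaks = h$breaks, right = TRUE, include.lowest = TRUE)

frequency_table <- as.data.frame(table(classes))
names(frequency_table) <- c("class", "frequency")
print(frequency_table)
```

The class labels come out in interval notation such as `(0.76,0.765]`, which also makes explicit which boundary belongs to which class.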
