この記事で行うこと
SQLで度数分布表を作ります。
度数がゼロの階級も歯抜けさせずに作成します。
PostgreSQLとSQLite3の両方で行います。
※(2019-11-13) スカラーサブクエリーを使う等、SQLをリファクタリングしました。SQLをDataGridで整形しました。
参考ページ(感謝します)
SQLでヒストグラムを作成してみよう
SQLite3 で generate_series(連番)
(SQLite3, MySQL, PHP) 小数点以下は切り上げて整数にしたい
数学英語【和⇒英】英語の数字・数式の読み方 統計学・確率論用語集
テストデータ作成
まずテストデータを作成します。
以下に、PostgreSQLとSQLite3に同じテストデータを作成するPythonコードを記します(例外処理とか省いています)。
テーブル名はfooで、列名scoreにテストの点数を格納することにします。
データ件数は100件で、平均値50点、標準偏差25点の正規分布乱数にしました。
このバラツキの大きさから、0点より小さな点数や、100点以上の点数も生成されるかも?ということが予測できます。
import psycopg2
import sqlite3
from pathlib import Path
from contextlib import closing
import math
import numpy as np
dir_db = Path.home() / "wk"
dbname = "testdb.sqlite3"
with closing(psycopg2.connect("host=localhost port=5432 dbname=default user=default password=secret")) as conn_p, \
closing(sqlite3.connect(dir_db / dbname)) as conn_s:
with conn_p.cursor() as cur_p:
cur_p.execute('''\
DROP TABLE IF EXISTS foo;
''')
conn_s.execute('''\
DROP TABLE IF EXISTS foo;
''')
cur_p.execute('''\
CREATE TABLE foo
(
score INTEGER NOT NULL
)
''')
conn_s.execute('''\
CREATE TABLE foo
(
score INTEGER NOT NULL
)
''')
for ii in range(100):
score = math.floor(np.random.normal(50, 25))
cur_p.execute(f'''\
INSERT INTO foo (score) VALUES ({score})
''')
conn_s.execute(f'''\
INSERT INTO foo (score) VALUES ({score})
''')
conn_p.commit()
conn_s.commit()
生成されたデータは以下の通りです。
default=# select * from foo order by 1;
score
-------
-13
-11
-9
3
6
7
11
13
13
17
18
22
23
24
24
24
25
27
28
28
29
29
29
29
30
30
30
32
33
33
34
36
36
36
36
37
38
39
40
41
41
42
42
44
44
44
45
47
48
51
51
51
52
52
55
56
56
56
56
56
56
57
57
57
58
59
59
59
63
63
63
63
63
63
64
64
65
66
66
66
67
67
68
68
68
72
73
73
74
77
77
79
81
81
85
87
92
97
105
115
(100 rows)
今回は期待通り、0点より小さな点数や、100点以上の点数も生成されました。
度数分布表を作成するSQL
度数分布表の点数全体の範囲は、DBから点数の最小値・最大値を検索するのではなく、人がSQLで与えるものとします。
また、人が与えた最小値(0点)より小さな点数は最初の階級に、最大値(100点)以上の点数は最後の階級に含めるものとします。
以下のようなSQLを書いてみました。
SQLのconst_giveの部分が人が与える定数で、min_scoreが点数の最小値、max_scoreが点数の最大値、nbinsが階級数です。
SQL文中のconstは全ての定数、bin_widthは階級幅です。
PostgreSQL用のSQL
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
10 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
SUM(frequency) AS 度数
FROM (
(SELECT (SELECT GREATEST(0, LEAST(nbins - 1, CAST(FLOOR((foo.score - min_score) / bin_width) AS INTEGER)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index)
UNION ALL
(SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
) AS hoge
GROUP BY bin_index
ORDER BY bin_index
SQLite3用のSQL
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
10 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
SUM(frequency) AS 度数
FROM (
SELECT (SELECT MAX(0, MIN(nbins - 1, CAST((foo.score - min_score) / bin_width AS INT)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index
UNION ALL
SELECT *
FROM (SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
)
GROUP BY bin_index
ORDER BY bin_index
PostgreSQLの検索結果
インデックス | 下限 | 上限 | 度数
--------------+---------------------+----------------------+------
0 | 0.0000000000000000 | 10.0000000000000000 | 6
1 | 10.0000000000000000 | 20.0000000000000000 | 5
2 | 20.0000000000000000 | 30.0000000000000000 | 13
3 | 30.0000000000000000 | 40.0000000000000000 | 14
4 | 40.0000000000000000 | 50.0000000000000000 | 11
5 | 50.0000000000000000 | 60.0000000000000000 | 19
6 | 60.0000000000000000 | 70.0000000000000000 | 17
7 | 70.0000000000000000 | 80.0000000000000000 | 7
8 | 80.0000000000000000 | 90.0000000000000000 | 4
9 | 90.0000000000000000 | 100.0000000000000000 | 4
(10 rows)
SQLite3の検索結果
インデッ 下限 上限 度数
---- ------------- ---- ----
0 0.0 10.0 6
1 10.0 20.0 5
2 20.0 30.0 13
3 30.0 40.0 14
4 40.0 50.0 11
5 50.0 60.0 19
6 60.0 70.0 17
7 70.0 80.0 7
8 80.0 90.0 4
9 90.0 100.0 4
歯抜けがないか確認
SQL文の階級数を10から100に変更して、歯抜けが発生していないか確認してみます。
PostgreSQL用のSQL(歯抜けがないか確認)
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
- 10 AS nbins
+ 100 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
SUM(frequency) AS 度数
FROM (
(SELECT (SELECT GREATEST(0, LEAST(nbins - 1, CAST(FLOOR((foo.score - min_score) / bin_width) AS INTEGER)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index)
UNION ALL
(SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
) AS hoge
GROUP BY bin_index
ORDER BY bin_index
SQLite3用のSQL(歯抜けがないか確認)
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
- 10 AS nbins
+ 100 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
SUM(frequency) AS 度数
FROM (
SELECT (SELECT MAX(0, MIN(nbins - 1, CAST((foo.score - min_score) / bin_width AS INT)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index
UNION ALL
SELECT *
FROM (SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
)
GROUP BY bin_index
ORDER BY bin_index
PostgreSQLの検索結果(歯抜けがないか確認)
インデックス | 下限 | 上限 | 度数
--------------+-------------------------+--------------------------+------
0 | 0.00000000000000000000 | 1.00000000000000000000 | 3
1 | 1.00000000000000000000 | 2.00000000000000000000 | 0
2 | 2.00000000000000000000 | 3.00000000000000000000 | 0
3 | 3.00000000000000000000 | 4.00000000000000000000 | 1
4 | 4.00000000000000000000 | 5.00000000000000000000 | 0
5 | 5.00000000000000000000 | 6.00000000000000000000 | 0
6 | 6.00000000000000000000 | 7.00000000000000000000 | 1
7 | 7.00000000000000000000 | 8.00000000000000000000 | 1
8 | 8.00000000000000000000 | 9.00000000000000000000 | 0
9 | 9.00000000000000000000 | 10.00000000000000000000 | 0
10 | 10.00000000000000000000 | 11.00000000000000000000 | 0
11 | 11.00000000000000000000 | 12.00000000000000000000 | 1
12 | 12.00000000000000000000 | 13.00000000000000000000 | 0
13 | 13.00000000000000000000 | 14.00000000000000000000 | 2
14 | 14.00000000000000000000 | 15.00000000000000000000 | 0
15 | 15.00000000000000000000 | 16.00000000000000000000 | 0
16 | 16.00000000000000000000 | 17.00000000000000000000 | 0
17 | 17.00000000000000000000 | 18.00000000000000000000 | 1
18 | 18.00000000000000000000 | 19.00000000000000000000 | 1
19 | 19.00000000000000000000 | 20.00000000000000000000 | 0
20 | 20.00000000000000000000 | 21.00000000000000000000 | 0
21 | 21.00000000000000000000 | 22.00000000000000000000 | 0
22 | 22.00000000000000000000 | 23.00000000000000000000 | 1
23 | 23.00000000000000000000 | 24.00000000000000000000 | 1
24 | 24.00000000000000000000 | 25.00000000000000000000 | 3
25 | 25.00000000000000000000 | 26.00000000000000000000 | 1
26 | 26.00000000000000000000 | 27.00000000000000000000 | 0
27 | 27.00000000000000000000 | 28.00000000000000000000 | 1
28 | 28.00000000000000000000 | 29.00000000000000000000 | 2
29 | 29.00000000000000000000 | 30.00000000000000000000 | 4
30 | 30.00000000000000000000 | 31.00000000000000000000 | 3
31 | 31.00000000000000000000 | 32.00000000000000000000 | 0
32 | 32.00000000000000000000 | 33.00000000000000000000 | 1
33 | 33.00000000000000000000 | 34.00000000000000000000 | 2
34 | 34.00000000000000000000 | 35.00000000000000000000 | 1
35 | 35.00000000000000000000 | 36.00000000000000000000 | 0
36 | 36.00000000000000000000 | 37.00000000000000000000 | 4
37 | 37.00000000000000000000 | 38.00000000000000000000 | 1
38 | 38.00000000000000000000 | 39.00000000000000000000 | 1
39 | 39.00000000000000000000 | 40.00000000000000000000 | 1
40 | 40.00000000000000000000 | 41.00000000000000000000 | 1
41 | 41.00000000000000000000 | 42.00000000000000000000 | 2
42 | 42.00000000000000000000 | 43.00000000000000000000 | 2
43 | 43.00000000000000000000 | 44.00000000000000000000 | 0
44 | 44.00000000000000000000 | 45.00000000000000000000 | 3
45 | 45.00000000000000000000 | 46.00000000000000000000 | 1
46 | 46.00000000000000000000 | 47.00000000000000000000 | 0
47 | 47.00000000000000000000 | 48.00000000000000000000 | 1
48 | 48.00000000000000000000 | 49.00000000000000000000 | 1
49 | 49.00000000000000000000 | 50.00000000000000000000 | 0
50 | 50.00000000000000000000 | 51.00000000000000000000 | 0
51 | 51.00000000000000000000 | 52.00000000000000000000 | 3
52 | 52.00000000000000000000 | 53.00000000000000000000 | 2
53 | 53.00000000000000000000 | 54.00000000000000000000 | 0
54 | 54.00000000000000000000 | 55.00000000000000000000 | 0
55 | 55.00000000000000000000 | 56.00000000000000000000 | 1
56 | 56.00000000000000000000 | 57.00000000000000000000 | 6
57 | 57.00000000000000000000 | 58.00000000000000000000 | 3
58 | 58.00000000000000000000 | 59.00000000000000000000 | 1
59 | 59.00000000000000000000 | 60.00000000000000000000 | 3
60 | 60.00000000000000000000 | 61.00000000000000000000 | 0
61 | 61.00000000000000000000 | 62.00000000000000000000 | 0
62 | 62.00000000000000000000 | 63.00000000000000000000 | 0
63 | 63.00000000000000000000 | 64.00000000000000000000 | 6
64 | 64.00000000000000000000 | 65.00000000000000000000 | 2
65 | 65.00000000000000000000 | 66.00000000000000000000 | 1
66 | 66.00000000000000000000 | 67.00000000000000000000 | 3
67 | 67.00000000000000000000 | 68.00000000000000000000 | 2
68 | 68.00000000000000000000 | 69.00000000000000000000 | 3
69 | 69.00000000000000000000 | 70.00000000000000000000 | 0
70 | 70.00000000000000000000 | 71.00000000000000000000 | 0
71 | 71.00000000000000000000 | 72.00000000000000000000 | 0
72 | 72.00000000000000000000 | 73.00000000000000000000 | 1
73 | 73.00000000000000000000 | 74.00000000000000000000 | 2
74 | 74.00000000000000000000 | 75.00000000000000000000 | 1
75 | 75.00000000000000000000 | 76.00000000000000000000 | 0
76 | 76.00000000000000000000 | 77.00000000000000000000 | 0
77 | 77.00000000000000000000 | 78.00000000000000000000 | 2
78 | 78.00000000000000000000 | 79.00000000000000000000 | 0
79 | 79.00000000000000000000 | 80.00000000000000000000 | 1
80 | 80.00000000000000000000 | 81.00000000000000000000 | 0
81 | 81.00000000000000000000 | 82.00000000000000000000 | 2
82 | 82.00000000000000000000 | 83.00000000000000000000 | 0
83 | 83.00000000000000000000 | 84.00000000000000000000 | 0
84 | 84.00000000000000000000 | 85.00000000000000000000 | 0
85 | 85.00000000000000000000 | 86.00000000000000000000 | 1
86 | 86.00000000000000000000 | 87.00000000000000000000 | 0
87 | 87.00000000000000000000 | 88.00000000000000000000 | 1
88 | 88.00000000000000000000 | 89.00000000000000000000 | 0
89 | 89.00000000000000000000 | 90.00000000000000000000 | 0
90 | 90.00000000000000000000 | 91.00000000000000000000 | 0
91 | 91.00000000000000000000 | 92.00000000000000000000 | 0
92 | 92.00000000000000000000 | 93.00000000000000000000 | 1
93 | 93.00000000000000000000 | 94.00000000000000000000 | 0
94 | 94.00000000000000000000 | 95.00000000000000000000 | 0
95 | 95.00000000000000000000 | 96.00000000000000000000 | 0
96 | 96.00000000000000000000 | 97.00000000000000000000 | 0
97 | 97.00000000000000000000 | 98.00000000000000000000 | 1
98 | 98.00000000000000000000 | 99.00000000000000000000 | 0
99 | 99.00000000000000000000 | 100.00000000000000000000 | 2
(100 rows)
SQLite3の検索結果(歯抜けがないか確認)
インデッ 下限 上限 度数
---- ------------- ---- ----
0 0.0 1.0 3
1 1.0 2.0 0
2 2.0 3.0 0
3 3.0 4.0 1
4 4.0 5.0 0
5 5.0 6.0 0
6 6.0 7.0 1
7 7.0 8.0 1
8 8.0 9.0 0
9 9.0 10.0 0
10 10.0 11.0 0
11 11.0 12.0 1
12 12.0 13.0 0
13 13.0 14.0 2
14 14.0 15.0 0
15 15.0 16.0 0
16 16.0 17.0 0
17 17.0 18.0 1
18 18.0 19.0 1
19 19.0 20.0 0
20 20.0 21.0 0
21 21.0 22.0 0
22 22.0 23.0 1
23 23.0 24.0 1
24 24.0 25.0 3
25 25.0 26.0 1
26 26.0 27.0 0
27 27.0 28.0 1
28 28.0 29.0 2
29 29.0 30.0 4
30 30.0 31.0 3
31 31.0 32.0 0
32 32.0 33.0 1
33 33.0 34.0 2
34 34.0 35.0 1
35 35.0 36.0 0
36 36.0 37.0 4
37 37.0 38.0 1
38 38.0 39.0 1
39 39.0 40.0 1
40 40.0 41.0 1
41 41.0 42.0 2
42 42.0 43.0 2
43 43.0 44.0 0
44 44.0 45.0 3
45 45.0 46.0 1
46 46.0 47.0 0
47 47.0 48.0 1
48 48.0 49.0 1
49 49.0 50.0 0
50 50.0 51.0 0
51 51.0 52.0 3
52 52.0 53.0 2
53 53.0 54.0 0
54 54.0 55.0 0
55 55.0 56.0 1
56 56.0 57.0 6
57 57.0 58.0 3
58 58.0 59.0 1
59 59.0 60.0 3
60 60.0 61.0 0
61 61.0 62.0 0
62 62.0 63.0 0
63 63.0 64.0 6
64 64.0 65.0 2
65 65.0 66.0 1
66 66.0 67.0 3
67 67.0 68.0 2
68 68.0 69.0 3
69 69.0 70.0 0
70 70.0 71.0 0
71 71.0 72.0 0
72 72.0 73.0 1
73 73.0 74.0 2
74 74.0 75.0 1
75 75.0 76.0 0
76 76.0 77.0 0
77 77.0 78.0 2
78 78.0 79.0 0
79 79.0 80.0 1
80 80.0 81.0 0
81 81.0 82.0 2
82 82.0 83.0 0
83 83.0 84.0 0
84 84.0 85.0 0
85 85.0 86.0 1
86 86.0 87.0 0
87 87.0 88.0 1
88 88.0 89.0 0
89 89.0 90.0 0
90 90.0 91.0 0
91 91.0 92.0 0
92 92.0 93.0 1
93 93.0 94.0 0
94 94.0 95.0 0
95 95.0 96.0 0
96 96.0 97.0 0
97 97.0 98.0 1
98 98.0 99.0 0
99 99.0 100.0 2
度数がゼロの階級もゼロのデータが作成されて、歯抜けがないことが確認できました。
追記(2019-10-03)
0点より小さな点数と、100点超の点数について、結果表の下限列・上限列に値を反映させてみます。
以下、PostgreSQLのみ記述しますが、変更内容はSQLite3も同じです。
PostgreSQL用のSQL
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
10 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
- SUM(lower_limit) AS 下限,
- SUM(upper_limit) AS 上限,
+ MIN(lower_limit) AS 下限,
+ MAX(upper_limit) AS 上限,
SUM(frequency) AS 度数
FROM (
(SELECT (SELECT GREATEST(0, LEAST(nbins - 1, CAST(FLOOR((foo.score - min_score) / bin_width) AS INTEGER)))
FROM const) AS bin_index,
- 0 AS lower_limit,
- 0 AS upper_limit,
+ 1.0 * MIN(score) AS lower_limit,
+ 1.0 * MAX(score) AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index)
UNION ALL
(SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
) AS hoge
GROUP BY bin_index
ORDER BY bin_index
PostgreSQLの検索結果
インデックス | 下限 | 上限 | 度数
--------------+---------------------+---------------------+------
0 | -13.0 | 10.0000000000000000 | 6
1 | 10.0000000000000000 | 20.0000000000000000 | 5
2 | 20.0000000000000000 | 30.0000000000000000 | 13
3 | 30.0000000000000000 | 40.0000000000000000 | 14
4 | 40.0000000000000000 | 50.0000000000000000 | 11
5 | 50.0000000000000000 | 60.0000000000000000 | 19
6 | 60.0000000000000000 | 70.0000000000000000 | 17
7 | 70.0000000000000000 | 80.0000000000000000 | 7
8 | 80.0000000000000000 | 90.0000000000000000 | 4
9 | 90.0000000000000000 | 115.0 | 4
(10 rows)
追記(2019-10-11)
アスタリスクで簡易ヒストグラムを表示させてみました。
1.0で割っている箇所については、アスタリスク表示が長すぎる時は数値を調整します。
PostgreSQL用のSQL
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
10 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
- SUM(frequency) AS 度数
+ SUM(frequency) AS 度数,
+ REPEAT('*', CAST(ROUND(SUM(frequency) / 1.0) AS INTEGER)) AS 簡易ヒストグラム
FROM (
(SELECT (SELECT GREATEST(0, LEAST(nbins - 1, CAST(FLOOR((foo.score - min_score) / bin_width) AS INTEGER)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index)
UNION ALL
(SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
) AS hoge
GROUP BY bin_index
ORDER BY bin_index
PostgreSQLの検索結果
インデックス | 下限 | 上限 | 度数 | 簡易ヒストグラム
--------------+---------------------+----------------------+------+---------------------
0 | 0.0000000000000000 | 10.0000000000000000 | 6 | ******
1 | 10.0000000000000000 | 20.0000000000000000 | 5 | *****
2 | 20.0000000000000000 | 30.0000000000000000 | 13 | *************
3 | 30.0000000000000000 | 40.0000000000000000 | 14 | **************
4 | 40.0000000000000000 | 50.0000000000000000 | 11 | ***********
5 | 50.0000000000000000 | 60.0000000000000000 | 19 | *******************
6 | 60.0000000000000000 | 70.0000000000000000 | 17 | *****************
7 | 70.0000000000000000 | 80.0000000000000000 | 7 | *******
8 | 80.0000000000000000 | 90.0000000000000000 | 4 | ****
9 | 90.0000000000000000 | 100.0000000000000000 | 4 | ****
(10 rows)
SQLite3用のSQL
SQLite3はこんなSQLしか思いつきませんでした。
アドホック分析では良いんじゃないでしょうか(^^;
WITH
RECURSIVE
generate_index (ii) AS
(
SELECT 0
UNION ALL
SELECT ii + 1
FROM generate_index
),
const_give AS
(
SELECT 0 AS min_score,
100 AS max_score,
10 AS nbins
),
const AS
(
SELECT min_score,
max_score,
nbins,
1.0 * (max_score - min_score) / nbins AS bin_width
FROM const_give
)
SELECT bin_index AS インデックス,
SUM(lower_limit) AS 下限,
SUM(upper_limit) AS 上限,
- SUM(frequency) AS 度数
+ SUM(frequency) AS 度数,
+ SUBSTR('*****************************************', -ROUND(SUM(frequency) / 1.0)) AS 簡易ヒストグラム
FROM (
SELECT (SELECT MAX(0, MIN(nbins - 1, CAST((foo.score - min_score) / bin_width AS INT)))
FROM const) AS bin_index,
0 AS lower_limit,
0 AS upper_limit,
COUNT(*) AS frequency
FROM foo
GROUP BY bin_index
UNION ALL
SELECT *
FROM (SELECT ii,
(SELECT min_score + bin_width * ii FROM const),
(SELECT min_score + bin_width * (ii + 1) FROM const),
0
FROM generate_index
LIMIT (SELECT nbins FROM const))
)
GROUP BY bin_index
ORDER BY bin_index
SQLite3の検索結果
インデッ 下限 上限 度数 簡易ヒス
---- ------------- ---- ---- ----
0 0.0 10.0 6 ******
1 10.0 20.0 5 *****
2 20.0 30.0 13 *************
3 30.0 40.0 14 **************
4 40.0 50.0 11 ***********
5 50.0 60.0 19 *******************
6 60.0 70.0 17 *****************
7 70.0 80.0 7 *******
8 80.0 90.0 4 ****
9 90.0 100.0 4 ****