
How to compute descriptive statistics with the Python library Pandas

Posted at 2019-02-13, last updated at 2019-02-14

Honestly, I mostly use Pandas by feel, but computing descriptive statistics turned out to be extremely easy, so I'm sharing it here.

Code


# Import pandas
import pandas as pd

# Read the TSV file (it has no header row, so column names are supplied via names)
data = pd.read_csv(
    'example_reviews.tsv',
    delimiter="\t",
    names=[
        'FOOD',
        'SERVICE',
        'AMBIENCE',
        'COST_PERFORMANCE',
        'TOTAL'
    ]
)

# Print the descriptive statistics
print(data.describe())

Result


$ python descriptive_statistics_value.py 
            FOOD    SERVICE   AMBIENCE  COST_PERFORMANCE      TOTAL
count  30.000000  30.000000  30.000000         30.000000  30.000000
mean    3.066667   2.466667   3.183333          2.800000   2.933333
std     1.150212   1.090186   1.125591          1.141687   1.135124
min     1.000000   1.000000   1.000000          1.000000   1.000000
25%     2.000000   2.000000   2.625000          2.000000   2.000000
50%     3.250000   2.000000   3.000000          3.000000   3.000000
75%     4.000000   3.000000   4.000000          3.875000   3.875000
max     5.000000   4.500000   5.000000          5.000000   4.500000
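
Each row of the `describe()` output corresponds to an ordinary pandas method, so the table can be rebuilt by hand. The sketch below does that on a small frame made from the first five rows of the sample data; the names `small`, `summary`, and `manual` are mine and only for illustration. Two defaults are worth knowing when reading the numbers: `std()` is the sample standard deviation (`ddof=1`), and the percentiles use linear interpolation.

```python
import pandas as pd

# First five rows of the sample data, typed in by hand for illustration
small = pd.DataFrame(
    [
        [4.5, 4.5, 5, 4, 4.5],
        [3.5, 2, 4, 3, 3.5],
        [2, 2, 4, 2, 3],
        [4, 3, 3, 4, 4],
        [3.5, 3, 4, 3.5, 3.5],
    ],
    columns=['FOOD', 'SERVICE', 'AMBIENCE', 'COST_PERFORMANCE', 'TOTAL'],
)

# What describe() gives us in one call
summary = small.describe()

# The same rows, computed one method at a time
manual = pd.DataFrame({
    'count': small.count(),
    'mean': small.mean(),
    'std': small.std(),            # sample standard deviation (ddof=1)
    'min': small.min(),
    '25%': small.quantile(0.25),   # linear interpolation by default
    '50%': small.quantile(0.50),   # same as the median
    '75%': small.quantile(0.75),
    'max': small.max(),
}).T

print(summary)
print(manual)
```

If only one or two of these values are needed, calling the individual method (for example `data['FOOD'].mean()`) is probably simpler than slicing the `describe()` table.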

Sample data

4.5	4.5	5	4	4.5
3.5	2	4	3	3.5
2	2	4	2	3
4	3	3	4	4
3.5	3	4	3.5	3.5
5	3	4.5	4	4.5
4	3	3	3	4
3	2	3	2	3
4	3	3	5	4
1	1	1	1	1
2.5	1.5	3	2	2
1	2	2.5	1	1
2.5	2	2	3	2.5
2	2.5	3	2	2
3	3	3	3	3
2	2	3.5	2	2
1	1	1	1	1
2	1.5	1.5	2	2
4	1	5	5	3
4	2	2	2.5	3
3.5	3.5	3.5	4	3.5
4	1	3	2	3
3.5	3	3	3.5	3.5
1.5	2	3	2	2
3	1	2	2.5	2
3	3	4	3	3
4	4.5	5	4	4.5
4.5	4.5	4.5	4	4.5
2	2	2	1	1
4.5	4.5	4.5	3	4.5
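
To try this without saving a file first, the same tab-separated text can be passed to `read_csv` through `io.StringIO`. This is only a minimal sketch: it embeds the first three rows of the sample data, and `sample_text` and `columns` are names I made up for the example.

```python
import io

import pandas as pd

# First three rows of the sample data as a tab-separated string (illustration only)
sample_text = (
    "4.5\t4.5\t5\t4\t4.5\n"
    "3.5\t2\t4\t3\t3.5\n"
    "2\t2\t4\t2\t3\n"
)

columns = ['FOOD', 'SERVICE', 'AMBIENCE', 'COST_PERFORMANCE', 'TOTAL']

# read_csv accepts any file-like object, so StringIO stands in for the TSV file
data = pd.read_csv(io.StringIO(sample_text), sep="\t", names=columns)

print(data.describe())
```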
