How to use groupby

Posted at 2025-01-09 (last updated at 2025-01-09)

Contents

This is a memo on how to use the pandas groupby method, which lets you compute the mean, maximum, and minimum for each group.

First, create some sample data.

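import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Create the DataFrame
num = np.array([["A", 4], ["B", 5], ["A", 6], ["A", 7], ["B", 8], ["B", 9]], dtype=object)
data = pd.DataFrame(data=num, columns=["data", "result"])
# The object-dtype array leaves "result" as object, so cast it to int before aggregating
data["result"] = data["result"].astype(int)
print(data)

Output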

   data result
0    A      4
1    B      5
2    A      6
3    A      7
4    B      8
5    B      9

Next, specify the column to group by.

group = data.groupby('data')

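Here the rows are split into groups A and B according to the values in the "data" column.
From the grouped data, compute the mean, maximum, and minimum.

group_mean = group['result'].mean()  # mean
group_max = group['result'].max()  # maximum
group_min = group['result'].min()  # minimum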
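As a side note, the three statistics can also be computed in one call with `agg`, and iterating over the groupby object shows which rows fall into each group. The following is a minimal sketch rather than part of the original walkthrough; it rebuilds the same sample data from a dict (an assumption made so the block runs on its own), and the name `summary` is chosen here for illustration.

```python
import pandas as pd

# Same sample data as above, built from a dict so "result" is an integer column
data = pd.DataFrame({
    "data": ["A", "B", "A", "A", "B", "B"],
    "result": [4, 5, 6, 7, 8, 9],
})

# Iterating over a groupby object yields (group key, sub-DataFrame) pairs
for name, rows in data.groupby("data"):
    print(name, rows["result"].tolist())

# agg with a list of function names returns one column per statistic
summary = data.groupby("data")["result"].agg(["mean", "max", "min"])
print(summary)
```

The resulting table has one row per group (A, B) and one column per statistic, which can be convenient when all three values are needed together.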

Plot the results above as a bar chart.

# Draw the graph
# Create the x-axis labels
labels = ["A", "B"]

# Width of each bar
width = 0.3

# Left x-position of each group of bars
left = np.arange(len(labels))

plt.bar(left, group_mean, width=width, label="mean", color="coral")
plt.bar(left + width, group_max, width=width, label="max", color="steelblue")
plt.bar(left + width + width, group_min, width=width, label="min", color="forestgreen")
plt.xticks(left + width, labels)
plt.legend()
plt.show()

Output (figure: grouped bar chart of the mean, max, and min of "result" for groups A and B)

With the approach above, you can visualize the mean, maximum, and minimum for each group.
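For comparison, a similar grouped bar chart can be drawn from a single aggregated table using pandas' own plotting wrapper, which handles the bar offsets and tick labels. This is only a sketch of an alternative, not the method used in this article; the sample data is rebuilt from a dict so the block is self-contained, and the names `summary`, `fig`, and `ax` are chosen here for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as above, built from a dict so "result" is an integer column
data = pd.DataFrame({
    "data": ["A", "B", "A", "A", "B", "B"],
    "result": [4, 5, 6, 7, 8, 9],
})

# One row per group, one column per statistic
summary = data.groupby("data")["result"].agg(["mean", "max", "min"])

# DataFrame.plot.bar draws one cluster of bars per row and one bar per column
fig, ax = plt.subplots()
summary.plot.bar(ax=ax, rot=0, color=["coral", "steelblue", "forestgreen"])
ax.set_xlabel("data")
ax.set_ylabel("result")
```

Call `plt.show()` afterwards to display the figure in an interactive session.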
