8
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

groupby rollingのサンプル

8
Last updated at Posted at 2020-02-09

groupby rollingの使い方の簡単なサンプルです。

テストデータ


data=[
{"氏名":"山田","回数":5, "国語":40 ,"算数":40 },
{"氏名":"山田","回数":4, "国語":40 ,"算数":50 },
{"氏名":"山田","回数":3, "国語":30 ,"算数":40 },
{"氏名":"山田","回数":2, "国語":20 ,"算数":30 },
{"氏名":"山田","回数":1, "国語":10 ,"算数":20 },

{"氏名":"野田","回数":1, "国語":10 ,"算数":30 },
{"氏名":"野田","回数":2, "国語":20 ,"算数":40 },
{"氏名":"野田","回数":3, "国語":30 ,"算数":20 },
{"氏名":"野田","回数":4, "国語":40 ,"算数":50 },
{"氏名":"野田","回数":5, "国語":40 ,"算数":70 },

    
]
import pandas as pd
df=pd.DataFrame(data)
df.index.name="idx"
df

idx 氏名 回数 国語 算数
0 山田 5 40 40
1 山田 4 40 50
2 山田 3 30 40
3 山田 2 20 30
4 山田 1 10 20
5 野田 1 10 30
6 野田 2 20 40
7 野田 3 30 20
8 野田 4 40 50
9 野田 5 40 70

氏名毎の「回数」の直近3回の和を出す過程でgroupby rollingを使います。

df2=df.sort_values(["氏名","回数"])  #この行を忘れると、こっそりとバグる

df2_sum=df2.groupby(["氏名"]).rolling(3)[["算数"]].sum()
df2_sum.reset_index(inplace=True)
df2_sum.set_index(["idx"],inplace=True)

df2["直近3回の算数の和"]=df2_sum["算数"]
df2

結果

idx 氏名 回数 国語 算数 直近3回の算数の和
4 山田 1 10 20 nan
3 山田 2 20 30 nan
2 山田 3 30 40 90
1 山田 4 40 50 120
0 山田 5 40 40 130
5 野田 1 10 30 nan
6 野田 2 20 40 nan
7 野田 3 30 20 90
8 野田 4 40 50 110
9 野田 5 40 70 140

もっとスマートなやり方がありそうな気がする。ご存じの方は教えていただけると助かります。

@daikiclimateさんから、もっとスマートなコードをいただきました。

df = df.sort_values(["氏名","回数"]).reset_index().drop("idx", axis = 1)
df["math_sum"]=df.groupby(["氏名"]).rolling(3)["算数"].sum().reset_index()["算数"]

参考

Group by: split-apply-combine

8
3
2

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
8
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?