
Pandas: A Very Simple Example of DataFrame.rolling()

Last updated at 2019-02-20, posted at 2016-07-14

http://pandas.pydata.org/pandas-docs/stable/groupby.html
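While reading the page above, I suddenly ran into a function called rolling().
The API reference did not make it clear to me and left me confused, so I will work through some simple examples to get a feel for what kind of method it is.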

First, let's just try it out.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: s = pd.Series(range(0,7))

In [4]: s
Out[4]: 
0    0
1    1
2    2
3    3
4    4
5    5
6    6
dtype: int64

In [5]: s.rolling(window=3, min_periods=3).mean()
Out[5]: 
0    NaN
1    NaN
2    1.0
3    2.0
4    3.0
5    4.0
6    5.0
dtype: float64

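As Out[5] shows, combining rolling() with mean() computes, for each element of the Series s, the mean of the three elements up to and including that element. In other words, it computes a moving average.
For example, at index=3 the elements at index=1, 2, 3 have the values 1, 2, 3, so the output is the mean of those three values, 2.0.
For the elements at index=0 and 1, there are not enough preceding elements, so no calculation is performed and NaN is output.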
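To make the description above concrete, here is a minimal sketch that recomputes the same moving average by hand with plain slicing and compares it with the result of rolling(). The variable names `manual` and `rolled` are only illustrative, and the sketch assumes the Series contains no NaN values.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(0, 7))
window = 3

# For each position i, average the `window` values ending at i.
# Positions with fewer than `window` values available get NaN.
manual = pd.Series(
    [
        s.iloc[i - window + 1 : i + 1].mean() if i >= window - 1 else np.nan
        for i in range(len(s))
    ],
    index=s.index,
)

rolled = s.rolling(window=window, min_periods=window).mean()

# Compare the two results, treating NaN in the same position as equal.
print(np.allclose(manual.to_numpy(), rolled.to_numpy(), equal_nan=True))  # True
```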

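window sets how many elements, counting back along the index, go into each calculation, and min_periods sets the minimum number of elements required to produce a valid result. So if you want "the mean of 4 elements, but a result should be output as long as at least 2 elements are available", you can specify it as follows.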

In [6]: s.rolling(window=4, min_periods=2).mean()
Out[6]: 
0    NaN
1    0.5
2    1.0
3    1.5
4    2.5
5    3.5
6    4.5
dtype: float64
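As a rough way to see why only index=0 is NaN with window=4 and min_periods=2, the following sketch records how many elements each trailing window actually holds and applies the min_periods threshold by hand. The names `sizes`, `means`, and `manual` are hypothetical, and the sketch assumes the data contains no NaN values, so the number of elements in a window equals the number of valid observations.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(0, 7))
window, min_periods = 4, 2

sizes = []  # number of elements each trailing window actually holds
means = []  # mean of that window, or NaN if it holds too few elements
for i in range(len(s)):
    chunk = s.iloc[max(0, i - window + 1) : i + 1]
    sizes.append(len(chunk))
    means.append(chunk.mean() if len(chunk) >= min_periods else np.nan)

manual = pd.Series(means, index=s.index)
rolled = s.rolling(window=window, min_periods=min_periods).mean()

print(sizes)  # [1, 2, 3, 4, 4, 4, 4]
print(np.allclose(manual.to_numpy(), rolled.to_numpy(), equal_nan=True))  # True
```

Only the first window holds fewer than 2 elements, which is why index=0 alone is NaN, while index=1 and 2 are means over 2 and 3 elements respectively.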

Also, with center=True, each calculation covers window elements in total: the element at the target index together with the elements before and after it.

In [7]: s.rolling(window=3, min_periods=3, center=True).mean()
Out[7]: 
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
5    5.0
6    NaN
dtype: float64
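One way to read the centered result is as the trailing result moved back so that each value sits at the middle of its window. The sketch below checks this for the odd window size used here (window=3 with min_periods=3); it is an illustration for this specific case, not a general statement about every combination of parameters.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(0, 7))
window = 3

trailing = s.rolling(window=window, min_periods=window).mean()
centered = s.rolling(window=window, min_periods=window, center=True).mean()

# Shift each trailing result back by (window - 1) // 2 positions so that it
# lines up with the middle element of its window.
shifted = trailing.shift(-((window - 1) // 2))

print(np.allclose(shifted.to_numpy(), centered.to_numpy(), equal_nan=True))  # True
```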
