pandas resample and rolling

Posted at 2017-08-19 (last updated at 2017-08-19)

Introduction

pandas `resample` and `rolling` look alike but behave differently, and I end up looking them up every time, so here is a short summary.
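As a rough illustration of the difference, the sketch below applies both to the same series. The data is made up for this example, and the frequency strings follow current pandas aliases. `resample` groups the time axis into bins and returns one row per bin, while `rolling` slides a window over the data and returns one row per input row.

```python
import numpy as np
import pandas as pd

# Ten one-minute samples with values 0..9 (made-up data for illustration)
idx = pd.date_range("2017-08-19 00:00", periods=10, freq="min")
s = pd.Series(np.arange(10, dtype=float), index=idx)

# resample: split the time axis into 5-minute bins, one output row per bin
binned = s.resample("5min").sum()

# rolling: slide a 5-sample window over the data, one output row per input row
windowed = s.rolling(5).sum()

print(len(binned), len(windowed))
```

The first four rows of `windowed` are `NaN` because a full 5-sample window is not yet available there.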

Version: pandas v0.20.3

Reference

resample

Function signature

pandas.DataFrame.resample — pandas 0.20.3 documentation

DataFrame.resample(
    rule,
    how=None,
    axis=0,
    fill_method=None,
    closed=None,
    label=None,
    convention='start',
    kind=None,
    loffset=None,
    limit=None,
    base=0,
    on=None,
    level=None
)

Methods to chain afterward

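Method	Description
first	First non-`NaN` value in each bin
last	Last non-`NaN` value in each bin
bfill	Backward fill: takes the nearest later value (fills `NaN`)
ffill	Forward fill: takes the nearest earlier value (fills `NaN`)
count	Number of non-`NaN` values
nunique	Number of unique values
max	Maximum
min	Minimum
mean	Mean
median	Median
sum	Sum
var	Variance
std	Standard deviation
ohlc	Open, high, low, and close values
pad	Same as ffill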
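The following is a minimal sketch of a few of these methods, using made-up data and current pandas frequency aliases. The first half downsamples one-minute data into 5-minute bins. The second half upsamples 2-minute data to one-minute steps, which is where `ffill` and `bfill` matter.

```python
import pandas as pd

# Downsampling: ten one-minute samples grouped into 5-minute bins
idx = pd.date_range("2017-08-19 00:00", periods=10, freq="min")
s = pd.Series([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0], index=idx)

r = s.resample("5min")
firsts = r.first()  # first value in each bin
lasts = r.last()    # last value in each bin
bars = r.ohlc()     # DataFrame with columns open, high, low, close

# Upsampling: three 2-minute samples expanded to one-minute steps
coarse = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2017-08-19 00:00", periods=3, freq="2min"),
)
up = coarse.resample("min")
gaps = up.asfreq()     # new rows are NaN
forward = up.ffill()   # new rows copy the previous value
backward = up.bfill()  # new rows copy the next value

print(bars)
print(forward.tolist(), backward.tolist())
```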

rolling

Function signature

pandas.DataFrame.rolling — pandas 0.20.3 documentation

DataFrame.rolling(
    window,
    min_periods=None,
    freq=None,
    center=False,
    win_type=None,
    on=None,
    axis=0,
    closed=None
)

Methods to chain afterward

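Method	Description
count	Number of non-`NaN` values
max	Maximum
min	Minimum
sum	Sum
mean	Mean
median	Median
var	Variance
std	Standard deviation
cov	Covariance matrix
corr	Correlation matrix
skew	Skewness (third moment)
kurt	Kurtosis (fourth moment)
quantile	Quantile
apply	Aggregation with a custom function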
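As a small sketch of how the window options interact with these methods (the input values are made up), the example below computes a 3-sample mean with a trailing window, a centered window, and a window that tolerates partial data via `min_periods`.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Trailing window: each result uses the current value and the two before it
trailing = s.rolling(3).mean()

# Centered window: each result uses one value on either side
centered = s.rolling(3, center=True).mean()

# min_periods=1: emit a result as soon as one value is available
partial = s.rolling(3, min_periods=1).mean()

# Other aggregations chain the same way
highest = s.rolling(3).max()

print(pd.DataFrame({
    "trailing": trailing,
    "centered": centered,
    "partial": partial,
    "highest": highest,
}))
```

The output always has the same length as the input; positions without enough data in the window are `NaN`.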

Aggregation with a custom function

`rolling().apply()` lets you compute a custom aggregation.

Example: an FIR filter (here, a moving-average filter)

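import numpy as np
import pandas as pd

# Sample input, assumed for illustration (the original leaves `series` undefined)
series = pd.Series(np.arange(10, dtype=float))

# Filter coefficients (5-point moving average)
b = np.ones(5) / 5

def f(x):
    # x is an array holding the values inside the current window
    # x[0] is the oldest value, x[-1] is the newest

    # Return the aggregated value for this window
    return np.sum(b * x)

# Apply
series.rolling(5, center=True).apply(f)
# series.rolling(5, center=True).apply(lambda x: np.sum(b * x)) also works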
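
As a sanity check, the sketch below compares the custom filter with the built-in `mean`, and uses a deliberately lopsided weight vector to confirm that the last element of the window is the newest sample. The input series and the name `pick_newest` are made up for this example. `np.asarray` is used so the function behaves the same whether pandas passes an array or a Series, which depends on the pandas version.

```python
import numpy as np
import pandas as pd

series = pd.Series(np.arange(10, dtype=float))

# Uniform coefficients: should reproduce the built-in rolling mean
b = np.ones(5) / 5

def f(x):
    return np.sum(b * np.asarray(x))

custom = series.rolling(5, center=True).apply(f)
builtin = series.rolling(5, center=True).mean()

# All weight on the last element: returns the newest sample in each window
w = np.array([0.0, 0.0, 1.0])

def pick_newest(x):
    return np.sum(w * np.asarray(x))

newest = series.rolling(3).apply(pick_newest)

print(np.allclose(custom.dropna(), builtin.dropna()))
print(newest.tolist())
```

With asymmetric coefficients, note that `b[0]` multiplies the oldest sample in the window, so reverse the coefficient array if the filter is defined with `b[0]` applied to the newest sample.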
