More than 5 years have passed since last update.

Pandas: Filling missing values per group with groupby()


Study notes on Pandas.

http://pandas.pydata.org/pandas-docs/stable/groupby.html
While reading the page above, I found its example of filling missing values per group hard to follow, so here is a simpler one.

Setup.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: key = list('ABCABCABC')

In [4]: value = [1,2,3,np.nan,np.nan,np.nan,4,4,4]

In [5]: df = pd.DataFrame({'key': key, 'value': value})

In [6]: df
Out[6]: 
  key  value
0   A    1.0
1   B    2.0
2   C    3.0
3   A    NaN
4   B    NaN
5   C    NaN
6   A    4.0
7   B    4.0
8   C    4.0

ffill() per group

Calling ffill() without grouping fills all three NaNs with 3.0, the value at index 2.

In [7]: df.ffill()
Out[7]: 
  key  value
0   A    1.0
1   B    2.0
2   C    3.0
3   A    3.0
4   B    3.0
5   C    3.0
6   A    4.0
7   B    4.0
8   C    4.0

Grouping by key before calling ffill() fills each NaN with the preceding value within its own group. As a result, the values 1.0, 2.0, 3.0 at index 0, 1, 2 (keys A, B, C respectively) fill the values at index 3, 4, 5 (keys A, B, C respectively).

In [8]: df.groupby('key').ffill()
Out[8]: 
  key  value
0   A    1.0
1   B    2.0
2   C    3.0
3   A    1.0
4   B    2.0
5   C    3.0
6   A    4.0
7   B    4.0
8   C    4.0
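
The comparison above can also be reproduced as a standalone script. This is a minimal sketch rather than the author's exact session: it selects the value column explicitly, on the assumption that whether the key column appears in the result of groupby('key').ffill() can differ between pandas versions. The names grouped_result and plain_result are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "key": list("ABCABCABC"),
    "value": [1, 2, 3, np.nan, np.nan, np.nan, 4, 4, 4],
})

# Forward fill within each key group; selecting the column yields a Series
grouped = df.groupby("key")["value"].ffill()

# Ungrouped forward fill, for comparison
plain = df["value"].ffill()

grouped_result = grouped.tolist()
plain_result = plain.tolist()
print(grouped_result)
print(plain_result)
```

The grouped version carries 1.0, 2.0, 3.0 forward within A, B, C, while the ungrouped version carries only the last seen value (3.0) forward regardless of key.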

Filling with the group mean

Where value is NaN, fill it with the mean of its group.

In [9]: f = lambda x: x.fillna(x.mean())

In [10]: transformed = df.groupby('key').transform(f)

In [11]: transformed
Out[11]: 
   value
0    1.0
1    2.0
2    3.0
3    2.5
4    3.0
5    3.5
6    4.0
7    4.0
8    4.0
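
An alternative way to write the same fill, shown here only as a sketch and not as the approach used in the original notes, is to compute the per-row group mean with transform("mean") and pass it to fillna(). The variable names group_mean and filled are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "key": list("ABCABCABC"),
    "value": [1, 2, 3, np.nan, np.nan, np.nan, 4, 4, 4],
})

# Group mean broadcast back to every row (NaN is skipped when averaging)
group_mean = df.groupby("key")["value"].transform("mean")

# Replace only the missing entries with the mean of their own group
filled = df["value"].fillna(group_mean)

filled_result = filled.tolist()
print(filled_result)
```

This avoids the lambda and keeps the key column in df untouched, since only the value column is selected and filled.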

Taking the mean of each group before and after filling gives the same values (GroupBy.mean() excludes NaN from the calculation).

In [12]: df.groupby('key').mean()
Out[12]: 
     value
key       
A      2.5
B      3.0
C      3.5

In [13]: transformed.groupby(key).mean()
Out[13]: 
   value
A    2.5
B    3.0
C    3.5
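
The reason the means match is simple arithmetic: if a group has n observed values with mean m and k missing values are filled with m, the new mean is (n*m + k*m) / (n + k) = m. The sketch below checks this on the same data; before, after, and same are names made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "key": list("ABCABCABC"),
    "value": [1, 2, 3, np.nan, np.nan, np.nan, 4, 4, 4],
})

# Group means before filling (NaN is skipped)
before = df.groupby("key")["value"].mean()

# Fill each group's NaN with that group's mean
filled = df.groupby("key")["value"].transform(lambda s: s.fillna(s.mean()))

# Group means after filling, grouping by the original key column
after = filled.groupby(df["key"]).mean()

same = bool(np.allclose(before.to_numpy(), after.to_numpy()))
print(same)
```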