
Pandas: Differences between keys, items, iteritems, iterrows, and itertuples

pandas

Posted at 2021-12-01

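It's embarrassing to admit, but I forget this every time, so I'm writing it down.
When you want to enumerate rows, itertuples() is the better choice. iterrows() does not preserve each column's dtype.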

import pandas as pd

df = pd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
                   'population': [1864, 22000, 80000]},
                  index=['panda', 'polar', 'koala'])

df

	species	population
panda	bear	1864
polar	bear	22000
koala	marsupial	80000

keys

Iterating over a DataFrame directly, or over keys(), enumerates the column names.

for name in df:
    print(name)

species
population

for name in df.keys():
    print(name)

species
population
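As a quick check of the behavior above, here is a minimal sketch that rebuilds the same df and compares the three ways of getting column labels. The variable names (from_iter, from_keys, from_columns) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
                   'population': [1864, 22000, 80000]},
                  index=['panda', 'polar', 'koala'])

# Iterating over a DataFrame yields its column labels,
# and keys() returns the same labels as df.columns.
from_iter = list(df)
from_keys = list(df.keys())
from_columns = list(df.columns)

print(from_iter)                               # ['species', 'population']
print(from_iter == from_keys == from_columns)  # True
```

In practice, iterating over df.columns tends to make the intent clearer to readers than iterating over the DataFrame itself.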

items, iteritems

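items and iteritems behave the same: both enumerate (column name, Series of column values) pairs. Note that iteritems was deprecated in pandas 1.5 and removed in pandas 2.0, so use items on current versions.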

# iteritems() was removed in pandas 2.0; items() is equivalent
for name, items in df.items():
    print("\nname: " + name)
    print(items)

name: species
panda         bear
polar         bear
koala    marsupial
Name: species, dtype: object

name: population
panda     1864
polar    22000
koala    80000
Name: population, dtype: int64
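Because each Series produced here holds a single column, it keeps that column's dtype. The following sketch, assuming a pandas version where items() is available, checks this for the population column. The names labels and population_is_integer are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
                   'population': [1864, 22000, 80000]},
                  index=['panda', 'polar', 'koala'])

# items() yields (column label, Series) pairs, one per column.
# Each Series holds a single column, so it keeps that column's dtype.
labels = []
population_is_integer = None
for name, column in df.items():
    labels.append(name)
    if name == 'population':
        population_is_integer = bool(pd.api.types.is_integer_dtype(column.dtype))

print(labels)                 # ['species', 'population']
print(population_is_integer)  # True
```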

iterrows

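iterrows enumerates (row label, Series of row values) pairs. Type information is lost: each row is packed into a single Series, so its values are converted to one common dtype (object in this example).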

for name, items in df.iterrows():
    print("\nname: " + name)
    print(items)

name: panda
species       bear
population    1864
Name: panda, dtype: object

name: polar
species        bear
population    22000
Name: polar, dtype: object

name: koala
species       marsupial
population        80000
Name: koala, dtype: object
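The dtype loss is easier to see with an all-numeric frame. The sketch below uses a hypothetical DataFrame, df_num, with one integer column and one float column (both names are invented here, not from the example above). The integer value comes back as a float because the row Series has a single dtype.

```python
import pandas as pd

# Hypothetical all-numeric frame: one integer column, one float column.
df_num = pd.DataFrame({'qty': [1, 2], 'ratio': [0.5, 1.5]})

# iterrows() packs each row into a single Series, so the values
# are upcast to one common dtype (float64 here).
label, row = next(df_num.iterrows())

print(df_num['qty'].dtype)  # int64
print(row.dtype)            # float64
print(row['qty'])           # 1.0
```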

itertuples

itertuples enumerates each row as a namedtuple of its values, with the index as the first field. Type information is preserved.

for items in df.itertuples():
    print(items)

Pandas(Index='panda', species='bear', population=1864)
Pandas(Index='polar', species='bear', population=22000)
Pandas(Index='koala', species='marsupial', population=80000)
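A short sketch of how the namedtuples might be used, again with the same df. The index and name parameters are part of itertuples; the variables first, plain, and total are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
                   'population': [1864, 22000, 80000]},
                  index=['panda', 'polar', 'koala'])

# Each row is a namedtuple: the index comes first, then one field per column.
first = next(df.itertuples())
print(first.Index, first.species, first.population)  # panda bear 1864

# index=False drops the index field; name=None yields plain tuples.
plain = list(df.itertuples(index=False, name=None))
print(plain[0])  # ('bear', 1864)

# Values keep their per-column types, so the integer column sums as integers.
total = sum(row.population for row in df.itertuples())
print(total)  # 103864
```

Attribute access such as row.population only works when the column name is a valid Python identifier. Otherwise pandas renames the field positionally, so positional access may be the safer option for such columns.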

References

https://pandas.pydata.org/docs/user_guide/basics.html#iteration
