
pandas Tips [Memo]

Last updated at 2021-12-11, posted at 2021-12-07

I love pandas, so I am writing down, as a memo to myself, the techniques that come in handy when using it.

Extracting rows (loc, iloc)

If you want to extract rows in pandas, one option is slice notation (e.g. df['R1':'R1']), but this is verbose when you only want a single row, and the result is a DataFrame rather than a Series.

This is where loc and iloc come in. df.loc['R1'] extracts row 'R1' from df as a Series, and df.iloc[0] does the same by integer position.

import pandas as pd
from pandas import Series, DataFrame

df = DataFrame(
    [
     [1, 2, 3],
     [4, 5, 6]
    ],
    columns = ['A', 'B', 'C'],
    index = ['R1', 'R2']
)

print(df)
#     A  B  C
# R1  1  2  3
# R2  4  5  6

print(df['A'])
# R1    1
# R2    4

print(df['R1':'R1'])
#     A  B  C
# R1  1  2  3

print(type(df['R1':'R1']))
# <class 'pandas.core.frame.DataFrame'>

print(df.loc['R1'])
# A    1
# B    2
# C    3

print(type(df.loc['R1']))
# <class 'pandas.core.series.Series'>

print(df.iloc[0])
# A    1
# B    2
# C    3

loc is (probably) short for locate or location. By specifying labels as df.loc[row, column] you can extract part of a DataFrame, and slice notation is supported as well.

A similar accessor is at. As the name suggests, it picks out a single point: df.at[row, column] returns one value, but it does not support slice notation and cannot extract a whole row or column.
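As a small illustration of the difference between loc and at, the sketch below reuses the same df as above. The variable names sub, block, and value are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=['A', 'B', 'C'],
    index=['R1', 'R2'],
)

# loc with one row label and a column slice returns a Series.
# Label-based slices include both endpoints.
sub = df.loc['R1', 'A':'B']
print(sub.tolist())
# [1, 2]

# loc with slices on both axes returns a DataFrame
block = df.loc['R1':'R2', 'B':'C']
print(block)
#     B  C
# R1  2  3
# R2  5  6

# at returns exactly one scalar value
value = df.at['R2', 'C']
print(value)
# 6
```

Passing a slice to at, such as df.at['R1', 'A':'B'], raises an error instead of returning a sub-table, so loc is the one to use for anything larger than a single cell.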

Adding rows

When the index (row labels) is meaningful

When the index carries meaning, such as dates or identifiers, you can add a row by assigning to a new row label.

df = DataFrame(
    [
     [1, 2, 3],
     [4, 5, 6]
    ],
    columns = ['A', 'B', 'C'],
    index = ['R1', 'R2']
)

# Row to add (as a list)
new_row = [10, 11, 12]

# Add it by assigning to a new row label
df.loc['R3'] = new_row
print(df)
#      A   B   C
# R1   1   2   3
# R2   4   5   6
# R3  10  11  12

When the index (row labels) has no meaning and is just a sequence number

In practice, this is probably the more common case.

In this case too, you can add a row in the same way as above, for example with df.loc[len(df)] = new_row.
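A minimal sketch of this approach, assuming df still has the default index 0, 1, ..., len(df) - 1:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
new_row = [10, 11, 12]

# With the default index, the label len(df) is not in use yet,
# so assigning to it adds a new row at the end.
df.loc[len(df)] = new_row
print(df)
#     A   B   C
# 0   1   2   3
# 1   4   5   6
# 2  10  11  12
```

If the index is not the default range (for example after rows have been dropped without resetting the index), the label len(df) may already exist, and the assignment then overwrites that row instead of adding one.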

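However, this does not feel much like "appending a row", so it is tempting to reach for the familiar append. pandas' append has several caveats, though. (DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the append code below runs only on earlier versions.)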

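The row to add can be a DataFrame, a Series, or a dict.
When adding a Series, the Series' index corresponds to the column names and its name to the row label.
To get a sequential index instead of specifying a row label, you need to pass ignore_index=True.
Unlike the list append method, it does not modify the object in place, so you must reassign the result to the original variable.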

import pandas as pd
from pandas import Series, DataFrame

df = DataFrame(
    [
     [1, 2, 3],
     [4, 5, 6]
    ],
    columns = ['A', 'B', 'C'],
)

# Row to add (as a list)
new_row = [10, 11, 12]

# Convert the list to a Series and append
print(
    df.append(Series(new_row, index=df.columns), ignore_index=True)
)
#     A   B   C
# 0   1   2   3
# 1   4   5   6
# 2  10  11  12

# Convert the list to a dict and append
print(
    df.append(dict(zip(df.columns, new_row)), ignore_index=True)
)
#     A   B   C
# 0   1   2   3
# 1   4   5   6
# 2  10  11  12

# df was not reassigned, so the original data is unchanged
print(df)
#    A  B  C
# 0  1  2  3
# 1  4  5  6

# Reassign the result to df
df = df.append(Series(new_row, index=df.columns), ignore_index=True)
print(df)
#     A   B   C
# 0   1   2   3
# 1   4   5   6
# 2  10  11  12
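On pandas 2.0 and later, where DataFrame.append no longer exists, pd.concat can produce the same result. The following is only a sketch of one way to do it, not the method used in this article; the names row_df, result, named, and labeled are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
new_row = [10, 11, 12]

# Wrap the list in a one-row DataFrame that shares df's columns
row_df = pd.DataFrame([new_row], columns=df.columns)

# concat returns a new DataFrame, so the result has to be assigned somewhere
result = pd.concat([df, row_df], ignore_index=True)
print(result)
#     A   B   C
# 0   1   2   3
# 1   4   5   6
# 2  10  11  12

# A Series' name becomes the row label once it is turned into a one-row frame
named = pd.Series(new_row, index=df.columns, name='R3').to_frame().T
labeled = pd.concat([df, named])
print(labeled.index.tolist())
# [0, 1, 'R3']
```

The same caveats carry over: ignore_index=True renumbers the rows, and df itself is left unchanged unless the result is assigned back to it.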
