[Python] Selecting columns with pandas.read_csv

Posted at 2022-08-02

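Overview

How to choose which columns to load when reading a CSV file with pandas.
In data analysis, when a file is large and contains many columns you do not need, loading only the required columns can make reading significantly faster.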
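The effect of loading fewer columns can be checked with a small experiment. The following is a minimal sketch, not a benchmark: the in-memory CSV, its size (1,000 rows by 50 columns), and the column names `col0` to `col49` are all made up for illustration. Timings vary by machine, so the sketch compares the resulting shape and memory footprint instead.

```python
import io

import pandas as pd

# Hypothetical wide CSV built in memory: 1,000 rows x 50 integer columns.
n_rows, n_cols = 1000, 50
header = ",".join(f"col{i}" for i in range(n_cols))
rows = "\n".join(
    ",".join(str(r * n_cols + c) for c in range(n_cols)) for r in range(n_rows)
)
csv_text = header + "\n" + rows + "\n"

# Read everything, then read only two columns.
full = pd.read_csv(io.StringIO(csv_text))
subset = pd.read_csv(io.StringIO(csv_text), usecols=["col0", "col1"])

print(full.shape)    # (1000, 50)
print(subset.shape)  # (1000, 2)

# The subset DataFrame holds far less data in memory.
full_bytes = full.memory_usage(deep=True).sum()
subset_bytes = subset.memory_usage(deep=True).sum()
print(subset_bytes < full_bytes)  # True
```

To measure actual read time on a real file, wrapping each `pd.read_csv` call with `time.perf_counter()` is one simple option.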

Data

import pandas as pd

df = pd.read_csv("example.csv")
print(df)

# Contents of example.csv:
# a,b,c,d
# 1,20,2,100
# 2,30,3,200
# 3,10,1,300
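To follow along, the sample file can be created locally. This is a small sketch that assumes the current working directory is writable; it writes the four-column data shown above to `example.csv` and reads it back without any column filter.

```python
from pathlib import Path

import pandas as pd

# Write the sample data from this article to example.csv.
csv_text = "a,b,c,d\n1,20,2,100\n2,30,3,200\n3,10,1,300\n"
Path("example.csv").write_text(csv_text, encoding="utf-8")

# Without usecols, every column is loaded.
df = pd.read_csv("example.csv")
print(df)

#    a   b  c    d
# 0  1  20  2  100
# 1  2  30  3  200
# 2  3  10  1  300
```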

Method 1

Pass the usecols option.
Only the columns listed in this option are loaded.
Columns can be given either as integer positions or as column-name strings.

df = pd.read_csv("example.csv", usecols=[0, 2])
print(df)
 
#    a  c
# 0  1  2
# 1  2  3
# 2  3  1
 
df = pd.read_csv("example.csv", usecols=["a", "c"])
print(df)
 
#    a  c
# 0  1  2
# 1  2  3
# 2  3  1
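One detail worth knowing when passing a list: pandas ignores the order of the elements in `usecols`, so the resulting columns follow the order in the file. The sketch below uses an in-memory copy of the sample data (via `io.StringIO`, an assumption made so the snippet runs without a file) and shows one way to reorder afterwards.

```python
import io

import pandas as pd

# In-memory copy of the sample data, so no file is needed.
csv_text = "a,b,c,d\n1,20,2,100\n2,30,3,200\n3,10,1,300\n"

# The list order ["c", "a"] is ignored; columns come back in file order.
df = pd.read_csv(io.StringIO(csv_text), usecols=["c", "a"])
print(list(df.columns))  # ['a', 'c']

# To get a specific order, index the result with the desired list.
reordered = df[["c", "a"]]
print(list(reordered.columns))  # ['c', 'a']
```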

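Method 2

To specify the columns to exclude instead, pass a callable to usecols as follows. pandas calls it with each column name and keeps the columns for which it returns True.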

df = pd.read_csv("example.csv", usecols=lambda x: x not in ["a", "c"])
print(df)

#     b    d
# 0  20  100
# 1  30  200
# 2  10  300
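The exclusion pattern can be written with a named set, which keeps the call readable when the list grows. This is a sketch using an in-memory copy of the sample data; the variable names `exclude` and `exclude_with_unknown` and the column name `"zzz"` are hypothetical. It also shows that the callable is only evaluated against names present in the file, so an excluded name that does not exist is simply never matched.

```python
import io

import pandas as pd

# In-memory copy of the sample data, so no file is needed.
csv_text = "a,b,c,d\n1,20,2,100\n2,30,3,200\n3,10,1,300\n"

# Columns to leave out; a set gives fast membership checks.
exclude = {"a", "c"}
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name not in exclude)
print(list(df.columns))  # ['b', 'd']

# "zzz" is not in the file; the callable never sees it, so nothing fails.
exclude_with_unknown = {"a", "zzz"}
df2 = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name not in exclude_with_unknown,
)
print(list(df2.columns))  # ['b', 'c', 'd']
```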
