
How to extract the column names of a given dtype (pandas)

Last updated at 2023-05-07. Posted at 2023-05-07.

Conclusion

Compare df.dtypes against the dtype you want and use the result to index df.columns, as follows.

# Extract the int columns
df.columns[df.dtypes == "int"]

# Extract the float columns
df.columns[df.dtypes == "float"]

# Extract the category columns
df.columns[df.dtypes == "category"]

Output

Index(['columns1', 'columns2',...], dtype='object')

Here, df is an $N \times M$ DataFrame.
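One caveat with the string comparison: `"int"` matches only NumPy's default integer dtype, so a column stored as, say, int8 is not picked up. As a hedged alternative sketch, pandas also provides `DataFrame.select_dtypes` and `pd.api.types.is_integer_dtype`. The frame and its column names below (`price`, `count`, `small_count`, `grade`) are hypothetical and not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame (not from the article):
# one float column, two integer columns of different widths, one category column.
df = pd.DataFrame({
    "price": np.array([1.5, 2.5, 3.5]),
    "count": np.arange(3, dtype="int64"),
    "small_count": np.arange(3, dtype="int8"),
    "grade": pd.Series(["a", "b", "a"], dtype="category"),
})

# select_dtypes returns a sub-DataFrame; .columns gives the names.
float_cols = df.select_dtypes(include="float64").columns
category_cols = df.select_dtypes(include="category").columns
numeric_cols = df.select_dtypes(include="number").columns  # all ints and floats

# is_integer_dtype is True for every integer width (int8, int64, ...).
integer_cols = [c for c in df.columns if pd.api.types.is_integer_dtype(df[c])]

print(list(float_cols), list(category_cols), list(numeric_cols), integer_cols)
```

Which form to use is a matter of taste: the `df.dtypes == ...` comparison is short and explicit, while `select_dtypes` is convenient when several related dtypes should be selected at once.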

A simple example

First, create a DataFrame that mixes float columns with integer (categorical) columns.

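import numpy as np
import pandas as pd

# Three float columns
point_data = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                  columns=['point_0', 'point_1', 'point_2'])
# Two integer columns, standing in for categorical variables
category_data = pd.DataFrame(np.arange(6).reshape(3, 2),
                  columns=['category_0', 'category_1'])
# Concatenate side by side (column-wise)
df = pd.concat([point_data, category_data], axis = 1)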

The contents of df look like this.

   point_0  point_1  point_2  category_0  category_1
0      0.0      1.0      2.0           0           1
1      3.0      4.0      5.0           2           3
2      6.0      7.0      8.0           4           5

As an example, extract the int columns.

df.columns[df.dtypes == "int"]

Output

Index(['category_0', 'category_1'], dtype='object')
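To make the mechanism explicit, here is a minimal sketch that splits the one-liner into two steps. It rebuilds the example frame from this article; the variable names `is_int` and `int_columns` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame used in this article.
point_data = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                          columns=['point_0', 'point_1', 'point_2'])
category_data = pd.DataFrame(np.arange(6).reshape(3, 2),
                             columns=['category_0', 'category_1'])
df = pd.concat([point_data, category_data], axis=1)

# df.dtypes is a Series indexed by column name, holding each column's dtype.
# Comparing it with "int" gives a boolean Series with one entry per column:
# False for the three point_* columns, True for the two category_* columns.
is_int = df.dtypes == "int"

# Indexing df.columns with that mask keeps only the names where it is True.
int_columns = df.columns[is_int]

print(is_int)
print(list(int_columns))
```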

Use those column names to select the columns and check the resulting DataFrame.

df[df.columns[df.dtypes == "int"]]

   category_0  category_1
0           0           1
1           2           3
2           4           5

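Use case

This is handy, for example, when passing the names of categorical features (int or category dtype) to CatBoost, a machine learning model that handles categorical variables well.
In that case, you can write: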

import catboost as cb

# Names of the int or category columns of df, converted to a plain list
category_features_name = df.columns[(df.dtypes == "int") | (df.dtypes == "category")].tolist()
clf = cb.CatBoostClassifier(cat_features = category_features_name, verbose = 0)
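The example frame in this article has no category column, so the following sketch checks the combined condition on a frame that has both kinds. It does not need CatBoost installed; the frame and its column names (`score`, `store_id`, `grade`) are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame (not from the article): a float, an int and a category column.
df = pd.DataFrame({
    "score": np.array([0.1, 0.2, 0.3]),
    "store_id": np.arange(3),
    "grade": pd.Series(["a", "b", "a"], dtype="category"),
})

# A column counts as categorical if its dtype is int OR category.
is_categorical = (df.dtypes == "int") | (df.dtypes == "category")

# Convert the resulting Index to a plain list of column names.
cat_feature_names = df.columns[is_categorical].tolist()

print(cat_feature_names)
```

The parentheses around each comparison are required, because `|` binds more tightly than `==` in Python.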
