
Notes to check my understanding of DataFrame

Posted at 2024-02-01

These are close to personal notes, written to deepen my understanding while working through データサイエンス100本ノック (100 Knocks for Data Science).

Selecting which columns of a DataFrame to display

> df_receipt
	sales_ymd	sales_epoch	store_cd	receipt_no	receipt_sub_no	customer_id	product_cd	quantity	amount
0	20181103	1541203200	S14006	112	1	CS006214000001	P070305012	1	158
1	20181118	1542499200	S13008	1132	2	CS008415000097	P070701017	1	81
2	20170712	1499817600	S14028	1102	1	CS028414000014	P060101005	1	170

To get the store_cd column from the DataFrame above, you can simply write the following:

> df_receipt["store_cd"]
0         S14006
1         S13008
2         S14028
3         S14042
4         S14025

At this point the type is pandas.core.series.Series.
What I don't fully understand yet is that if I instead write it like this:

> df_receipt[["store_cd"]]
	store_cd
0	S14006
1	S13008
2	S14028
3	S14042
4	S14025

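the output looks almost the same, but the type becomes pandas.core.frame.DataFrame.
Judging from the result, passing a list of column names to df_receipt seems to return a DataFrame made up of the listed columns.
In other words, the outer brackets are the indexing operator and the inner brackets are an ordinary Python list.
I'm still not used to reading consecutive brackets like this and breaking them down.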
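To make the double brackets easier to read, here is a minimal sketch that splits them into two steps. The small df_receipt built here is only a stand-in (three of the columns, with values copied from the rows shown above), and the variable names as_series, cols, as_frame and two_cols are made up for illustration.

```python
import pandas as pd

# Stand-in for df_receipt: a subset of the columns, values taken from the rows shown above.
df_receipt = pd.DataFrame(
    {
        "sales_ymd": [20181103, 20181118, 20170712],
        "store_cd": ["S14006", "S13008", "S14028"],
        "amount": [158, 81, 170],
    }
)

# A single label inside the indexing brackets returns that column as a Series.
as_series = df_receipt["store_cd"]
print(type(as_series).__name__)  # Series

# The inner brackets are just a Python list literal, so this is the same as
# df_receipt[["store_cd"]]: indexing with a list of labels returns a DataFrame.
cols = ["store_cd"]
as_frame = df_receipt[cols]
print(type(as_frame).__name__)  # DataFrame

# The Series is one-dimensional, the DataFrame is two-dimensional with one column.
print(as_series.shape, as_frame.shape)  # (3,) (3, 1)

# A longer list selects several columns, in the order they are listed.
two_cols = df_receipt[["store_cd", "amount"]]
print(list(two_cols.columns))  # ['store_cd', 'amount']
```

Seen this way, `df_receipt[["store_cd"]]` is "index df_receipt with the list `["store_cd"]`", which is why the result keeps the table shape even when the list has only one element.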
