
Python Pandas

Python

Last updated at 2023-02-20. Posted at 2022-11-03.

Reference URL

Installation

pip install pandas

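Data structures

Term	Description
columns	Column labels, laid out horizontally across the top
index	Row labels, laid out vertically down the side

Type	Description
DataFrame	A table made up of rows and columns, like an Excel sheet
Series	The type you get when you extract one column from a DataFrame; think of it as a list with an index attached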
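
To make the two tables above concrete, here is a minimal sketch that builds a small DataFrame in memory and pulls one column out as a Series. The sample values and the row labels (`row1` to `row3`) are invented for illustration and are not from any real data set.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
df = pd.DataFrame(
    {
        "IP": ["192.168.0.10", "192.168.0.11", "192.168.0.12"],
        "Name": ["alpha", "beta", "gamma"],
    },
    index=["row1", "row2", "row3"],
)

print(type(df))          # the table as a whole is a DataFrame
print(list(df.columns))  # column labels
print(list(df.index))    # row labels

# Selecting a single column returns a Series that keeps the row labels.
ip = df["IP"]
print(type(ip))
print(ip.to_list())      # the values, as a plain list
print(list(ip.index))    # the index attached to those values
```

The Series carries the same index as the DataFrame it came from, which is what "a list with an index attached" means in practice.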

Usage examples

import pandas as pd

# read_csv: load the CSV data into a variable
df = pd.read_csv('data.csv', encoding='shift-jis')

# Result: pandas.core.frame.DataFrame
type(df)

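# Maximum number of rows to display
## The default is 60; a longer DataFrame is truncated to 10 rows (display.min_rows)
print(pd.get_option('display.max_rows'))

## Passing None displays every row
pd.set_option('display.max_rows', None)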

# Maximum number of columns to display
pd.set_option('display.max_columns', None)

# Reset an option to its default
pd.reset_option('display.max_rows')

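# First 5 rows of the DataFrame, or a specified number of rows (10)
print(df.head())
print(df.head(10))

# Last 5 rows of the DataFrame, or a specified number of rows (10)
print(df.tail())
print(df.tail(10))

# Take a specified number of rows (10) from the DataFrame at random
print(df.sample(10))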

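# Summary information about the DataFrame
## info() prints its report itself and returns None, so do not wrap it in print()
df.info()

# Summary statistics; round(0) rounds to 0 decimal places
print(df.describe())
print(df.describe().round(0))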

# Get the DataFrame's column names
print(df.columns)

# Select columns from the DataFrame
## Example: select only the 'IP' column
print(df['IP'])

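# Filter DataFrame rows by a condition
## ==
## Example: rows where the IP column is '192.168.1.234' ------------------------[1]
print(df[df['IP'] == '192.168.1.234'])

## not → "~"
## Example: rows where the IP column is anything other than '192.168.1.234'
print(df[~(df['IP'] == '192.168.1.234')])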

# Filter DataFrame rows by multiple conditions
condition_1 = df['IP'] == '192.168.0.123'
condition_2 = df['Name'] == 'TEST'
## and → "&"
print(df[condition_1 & condition_2])
## or → "|"
print(df[condition_1 | condition_2])

# Using query() instead
print(df.query("IP == '192.168.0.123' and Name == 'TEST'"))
print(df.query("IP == '192.168.0.123' or Name == 'TEST'"))


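# Get a value from the filtered DataFrame
## Get the value in row 0 of the IP column
print(df['IP'].to_list()[0])

## Filter [1] above and value extraction in one line (raises IndexError if no row matches)
print(df[df['IP'] == '192.168.1.234']['IP'].to_list()[0])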

# Loop once per row of df; shape[0] is the number of rows
for i in range(df.shape[0]):
    # iloc[i] selects row i; to_dict() converts it to a dict, from which "colname" is read
    hoge = df.iloc[i].to_dict()["colname"]

# Write to CSV (both calls target the same file, so keep only the encoding you need)
df.to_csv('writeToCSV_test.csv', encoding='shift-jis')
df.to_csv('writeToCSV_test.csv', encoding='utf-8')
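
The filtering snippets above depend on a `data.csv` that is not included here, so the following is a self-contained sketch of the same ideas (boolean masks combined with `&`, `|`, and `~`, plus `query`) on a small in-memory DataFrame. The rows are made up for illustration; only the `IP` and `Name` column names come from the examples above.

```python
import pandas as pd

# Hypothetical rows, invented for this sketch.
df = pd.DataFrame(
    {
        "IP": ["192.168.0.123", "192.168.0.124", "192.168.0.123", "192.168.1.234"],
        "Name": ["TEST", "TEST", "PROD", "PROD"],
    }
)

# Each comparison yields a boolean Series (a mask) with one entry per row.
is_target_ip = df["IP"] == "192.168.0.123"
is_test = df["Name"] == "TEST"

both = df[is_target_ip & is_test]         # and: rows matching both masks
either = df[is_target_ip | is_test]       # or: rows matching at least one mask
neither = df[~(is_target_ip | is_test)]   # not: rows matching neither mask

# query() expresses the "and" filter as a string.
same_as_both = df.query("IP == '192.168.0.123' and Name == 'TEST'")

# Pull a single value out of a filtered column.
first_test_ip = df.loc[is_test, "IP"].iloc[0]

print(both)
print(either)
print(neither)
print(same_as_both)
print(first_test_ip)
```

Naming the masks keeps each condition readable, and the parentheses around `is_target_ip | is_test` matter because `~` binds more tightly than `|`.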

Reading data from Excel

pip3 install xlrd==1.2.0

Note
https://learn.microsoft.com/ja-jp/azure/databricks/kb/libraries/xlsx-file-not-supported-xlrd

xlrd 2.0.0 and later can read only .xls files.
Support for .xlsx files was removed from xlrd because of potential security vulnerabilities.

df = pd.read_excel('data.xlsx')
print(df)
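
Since xlrd 2.0.0 and later no longer read .xlsx files, one possible alternative to pinning xlrd 1.2.0 is to read the workbook through openpyxl. The sketch below assumes openpyxl has been installed separately (`pip install openpyxl`); the helper name `read_xlsx` is hypothetical and is not part of pandas.

```python
import importlib.util

import pandas as pd


def read_xlsx(path, sheet_name=0):
    """Read one sheet of an .xlsx workbook, explicitly requesting openpyxl.

    Hypothetical helper for this sketch; it only wraps pd.read_excel.
    """
    return pd.read_excel(path, sheet_name=sheet_name, engine="openpyxl")


# openpyxl is a separate package, so check that it is importable before calling.
HAS_OPENPYXL = importlib.util.find_spec("openpyxl") is not None
print("openpyxl available:", HAS_OPENPYXL)
```

With openpyxl installed, `read_xlsx('data.xlsx')` should return the first sheet as a DataFrame.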
