Getting Started with NumPy 3: Extracting Data from CSV

Last updated at 2019-04-14 / Posted at 2019-04-14

This time we read a CSV file with NumPy and clean up the data.
Download iris.csv from the following link.

Read all of the CSV data and store it in iris_data.

.py

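import csv

import numpy as np

# The csv module documentation recommends newline='' when opening the file
with open('iris.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file, delimiter=',', quotechar='"')
    # Every row, header included, as a list of strings
    csv_data = list(reader)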

iris_data = np.asarray(csv_data)
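
If you do not have iris.csv at hand, the same loading step can be tried on an in-memory file. This is only a minimal sketch: `sample_text`, `sample_rows`, and `sample_data` are made-up names, and the three lines of text are a hypothetical stand-in for the real file.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for iris.csv: a header row plus two data rows.
sample_text = (
    "SepalLength,SepalWidth,PetalLength,PetalWidth,Name\n"
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)

# io.StringIO behaves like an open text file, so csv.reader accepts it.
with io.StringIO(sample_text, newline="") as csv_file:
    reader = csv.reader(csv_file, delimiter=",", quotechar='"')
    sample_rows = list(reader)

sample_data = np.asarray(sample_rows)

print(sample_data.shape)       # (3, 5): the header row plus two data rows
print(sample_data.dtype.kind)  # U, meaning every cell is a Unicode string
```

Because csv.reader yields strings only, the resulting array holds text even in the numeric columns, which is why a conversion step is needed later.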

Right after loading, the array still contains the header row, as shown below.

.py


>>> iris_data[:5]
array([['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Name'],
       ['5.1', '3.5', '1.4', '0.2', 'Iris-setosa'],
       ['4.9', '3.0', '1.4', '0.2', 'Iris-setosa'],
       ['4.7', '3.2', '1.3', '0.2', 'Iris-setosa'],
       ['4.6', '3.1', '1.5', '0.2', 'Iris-setosa']], dtype='<U15')

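iris_data[1:] returns everything from the row at index 1 onward, which drops the header row. Note that basic slicing gives a view of the original array rather than a copy; here we simply rebind iris_data to that result.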

.py

>>> iris_data = iris_data[1:]
>>> iris_data[:5]
array([['5.1', '3.5', '1.4', '0.2', 'Iris-setosa'],
       ['4.9', '3.0', '1.4', '0.2', 'Iris-setosa'],
       ['4.7', '3.2', '1.3', '0.2', 'Iris-setosa'],
       ['4.6', '3.1', '1.5', '0.2', 'Iris-setosa'],
       ['5.0', '3.6', '1.4', '0.2', 'Iris-setosa']], dtype='<U15')
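
To see what the slice does on a small scale, here is a rough sketch using a made-up three-row array (`table`, `body`, and `body_copy` are illustrative names, not part of the original post). It also shows that a plain slice shares memory with the source array, while `.copy()` does not.

```python
import numpy as np

# Hypothetical table: one header row followed by two data rows.
table = np.array([["h1", "h2"],
                  ["1.0", "2.0"],
                  ["3.0", "4.0"]])

# Skip row 0 (the header) and keep the rest.
body = table[1:]
print(body.shape)                     # (2, 2)
print(np.shares_memory(table, body))  # True: the slice is a view

# An explicit copy is independent of the original array.
body_copy = table[1:].copy()
print(np.shares_memory(table, body_copy))  # False
```

For this tutorial the distinction does not matter much, since the string array is replaced by a float array in the next step.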

Extract only the numeric columns, convert them to floats, and build a new array.

.py

>>> select = lambda row: [float(row[0]), float(row[1]), float(row[2]), float(row[3])]
>>> iris_data = np.array([select(row) for row in iris_data])
>>> iris_data[:5]
array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])
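
As an alternative to converting row by row, the same result can probably be reached with column slicing and `astype`. The sketch below assumes, as in the data above, that the first four columns are numeric text and the fifth is the species name; `rows` and `numeric` are made-up names.

```python
import numpy as np

# Two rows in the same layout as the string array above.
rows = np.array([
    ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"],
    ["4.9", "3.0", "1.4", "0.2", "Iris-setosa"],
])

# Keep all rows, take columns 0..3, and parse the strings as floats.
numeric = rows[:, :4].astype(float)

print(numeric.dtype)  # float64
print(numeric.shape)  # (2, 4)
```

This avoids the Python-level loop and does not need the column indices spelled out one by one.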

Let's aggregate the data: the sum of all values, the sum of each column, and the mean of each column.

.py

>>> iris_data.sum()
2078.2
>>> iris_data.sum(axis = 0)
array([876.5, 458.1, 563.8, 179.8])
>>> iris_data.mean(axis = 0)
array([5.84333333, 3.054     , 3.75866667, 1.19866667])
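
The role of the axis argument is easier to check on an array small enough to add up by hand. A minimal sketch with a made-up 2x3 array named `values`:

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

total = values.sum()              # all six elements: 21.0
column_sums = values.sum(axis=0)  # collapse the rows: 5, 7, 9
row_sums = values.sum(axis=1)     # collapse the columns: 6, 15
column_means = values.mean(axis=0)  # per-column average: 2.5, 3.5, 4.5

print(total)
print(column_sums)
print(row_sums)
print(column_means)
```

With axis = 0 the result has one entry per column, which is why the iris output above has four values, one for each measurement.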

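Save the cleaned data to a file so it can be loaded again later. The .npy extension is recommended for the file name; np.save appends it automatically if the name does not already end with it.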

.py

np.save('iris.npy', iris_data)

.py

>>> iris_data2 = np.load('iris.npy')
>>> iris_data2[:5]
array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])
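
To convince yourself that the round trip loses nothing, the save and load steps can be tried in a temporary directory. This is a sketch with made-up names (`data`, `restored`, `sample.npy`, `no_extension`), not the file used above.

```python
import os
import tempfile

import numpy as np

data = np.array([[5.1, 3.5, 1.4, 0.2],
                 [4.9, 3.0, 1.4, 0.2]])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Save and reload using an explicit .npy file name.
    path = os.path.join(tmp_dir, "sample.npy")
    np.save(path, data)
    restored = np.load(path)

    # Without an extension, np.save adds ".npy" to the name itself.
    np.save(os.path.join(tmp_dir, "no_extension"), data)
    extension_added = os.path.exists(os.path.join(tmp_dir, "no_extension.npy"))

print(np.array_equal(data, restored))  # True: values survive the round trip
print(restored.dtype)                  # float64: the dtype is preserved too
print(extension_added)                 # True
```

Unlike writing the numbers back to CSV, the .npy format keeps the dtype and shape, so no parsing is needed on load.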

I also made a video walkthrough. Please take a look if you are interested.
https://youtu.be/sxObw5_pVvc
