An easy way to read a CSV file from a Python program.
Method
The standard library's csv module makes this easy.
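As a rough illustration of what the csv module does, here is a minimal sketch that parses CSV text held in memory and separates the header from the data rows. It assumes Python 3; `SAMPLE_TEXT` and `split_header` are hypothetical names made up for this example, not part of the script below.

```python
import csv
import io

# Hypothetical in-memory CSV text standing in for a file on disk.
SAMPLE_TEXT = "name,year\nAlice,2014\nBob,2015\n"


def split_header(text):
    """Parse CSV text and return (header, data_rows)."""
    # newline="" leaves line endings untouched so the csv module can handle them.
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)      # the first row is treated as the header
    data_rows = list(reader)   # the remaining rows are data
    return header, data_rows


header, data_rows = split_header(SAMPLE_TEXT)
print(header)
for row in data_rows:
    print(row)
```

Every field comes back as a string, so numeric columns such as `year` need an explicit conversion if numbers are wanted.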
Sample
Works on both Python 2 and 3, and handles Japanese text.
csv_read.py
from __future__ import print_function
from __future__ import unicode_literals
import argparse
import csv
import io
import six
parser = argparse.ArgumentParser()
parser.add_argument("csv_file_path")
parser.add_argument("--encoding", default="utf_8")
options = parser.parse_args()
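# Open in text mode with an explicit encoding; io.open behaves the same on Python 2 and 3.
# newline="" lets the csv module handle line endings itself.
with io.open(options.csv_file_path, "r", encoding=options.encoding, newline="") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",", quotechar='"')
    # The first row is the header.
    print("--- header ---\n{}\n".format(six.next(csv_reader)))
    print("--- data ---")
    # Each remaining row is a list of strings.
    for row in csv_reader:
        print(row)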
Example run: ASCII
sample_ascii.csv
"Country or Area","Year","Area","Sex","Record Type","Reliability","Source Year","Value","Value Footnotes"
"Afghanistan","2014","Total","Both Sexes","Estimate - de facto","Final figure, incomplete/questionable reliability","2015","26556754","1"
"Afghanistan","2014","Total","Male","Estimate - de facto","Final figure, incomplete/questionable reliability","2015","13585933","1"
"Afghanistan","2014","Total","Female","Estimate - de facto","Final figure, incomplete/questionable reliability","2015","12970821","1"
$ ./csv_read.py sample_ascii.csv
--- header ---
['Country or Area', 'Year', 'Area', 'Sex', 'Record Type', 'Reliability', 'Source Year', 'Value', 'Value Footnotes']
--- data ---
['Afghanistan', '2014', 'Total', 'Both Sexes', 'Estimate - de facto', 'Final figure, incomplete/questionable reliability', '2015', '26556754', '1']
['Afghanistan', '2014', 'Total', 'Male', 'Estimate - de facto', 'Final figure, incomplete/questionable reliability', '2015', '13585933', '1']
['Afghanistan', '2014', 'Total', 'Female', 'Estimate - de facto', 'Final figure, incomplete/questionable reliability', '2015', '12970821', '1']
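The "Reliability" column in this sample contains a comma inside a quoted field, which is where the `quotechar` setting matters. The sketch below, assuming Python 3, compares a naive `split(",")` with `csv.reader` on one line taken from sample_ascii.csv; the variable names are only for illustration.

```python
import csv
import io

# One data line copied from sample_ascii.csv.
line = (
    '"Afghanistan","2014","Total","Male","Estimate - de facto",'
    '"Final figure, incomplete/questionable reliability","2015","13585933","1"\n'
)

# Naive splitting breaks the quoted field at its inner comma.
naive_fields = line.rstrip("\n").split(",")

# csv.reader respects the quote character and keeps the field whole.
csv_fields = next(
    csv.reader(io.StringIO(line, newline=""), delimiter=",", quotechar='"')
)

print(len(naive_fields))  # number of pieces from the naive split
print(len(csv_fields))    # number of fields from csv.reader
print(csv_fields[5])      # the Reliability field, quotes removed
```

The naive split yields one piece more than the nine real columns and leaves the quote characters in place, while `csv.reader` returns the nine fields shown in the output above.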
Example run: UTF-8
sample_utf8.csv
dataset_id,year,publisher,group_title,frequency_of_update,data_format,language,resource_count
12971,2015,消費者庁,行財政,年単位,PDF,英語,1
12971,2015,消費者庁,行財政,年単位,HTML,日本語,59
12971,2015,消費者庁,行財政,年単位,PDF,日本語,4
12972,2015,消費者庁,行財政,年単位,HTML,日本語,3
12972,2015,消費者庁,行財政,年単位,PDF,日本語,6
12973,2015,消費者庁,行財政,年単位,PDF,日本語,6
12974,2015,消費者庁,行財政,年単位,PDF,日本語,4
12975,2015,消費者庁,行財政,その他(自由記述),PDF,日本語,7
12976,2015,消費者庁,行財政,その他(自由記述),PDF,日本語,4
$ ./csv_read.py sample_utf8.csv
--- header ---
['dataset_id', 'year', 'publisher', 'group_title', 'frequency_of_update', 'data_format', 'language', 'resource_count']
--- data ---
['12971', '2015', '消費者庁', '行財政', '年単位', 'PDF', '英語', '1']
['12971', '2015', '消費者庁', '行財政', '年単位', 'HTML', '日本語', '59']
['12971', '2015', '消費者庁', '行財政', '年単位', 'PDF', '日本語', '4']
['12972', '2015', '消費者庁', '行財政', '年単位', 'HTML', '日本語', '3']
['12972', '2015', '消費者庁', '行財政', '年単位', 'PDF', '日本語', '6']
['12973', '2015', '消費者庁', '行財政', '年単位', 'PDF', '日本語', '6']
['12974', '2015', '消費者庁', '行財政', '年単位', 'PDF', '日本語', '4']
['12975', '2015', '消費者庁', '行財政', 'その他(自由記述)', 'PDF', '日本語', '7']
['12976', '2015', '消費者庁', '行財政', 'その他(自由記述)', 'PDF', '日本語', '4']
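The script also accepts an `--encoding` option, which should cover files that are not UTF-8. The following is a sketch under assumptions, not a tested run of the script itself: it assumes Python 3, writes a small temporary file encoded as cp932 (a common encoding for Japanese CSV files on Windows), and reads it back the same way the script opens its input. The helper `read_csv_rows` and the temporary file are hypothetical.

```python
import csv
import io
import os
import tempfile


def read_csv_rows(path, encoding="utf_8"):
    """Read a CSV file with the given text encoding and return all rows."""
    with io.open(path, "r", encoding=encoding, newline="") as csv_file:
        return list(csv.reader(csv_file))


# Two columns borrowed from sample_utf8.csv, kept short for the example.
text = "publisher,language\n消費者庁,日本語\n"

fd, tmp_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # Write the sample as cp932 instead of UTF-8.
    with io.open(tmp_path, "w", encoding="cp932", newline="") as out_file:
        out_file.write(text)
    # Reading succeeds because the same encoding is named explicitly.
    rows = read_csv_rows(tmp_path, encoding="cp932")
finally:
    os.remove(tmp_path)

print(rows)
```

With the script, the equivalent would presumably be passing `--encoding cp932` after the CSV path.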
References
encoding - Python 2 and 3 csv reader - Stack Overflow
http://stackoverflow.com/questions/5180555/python-2-and-3-csv-reader