Qiita Teams that are logged in
You are not logged in to any team

Log in to Qiita Team
Community
OrganizationEventAdvent CalendarQiitadon (β)
Service
Qiita JobsQiita ZineQiita Blog
8
Help us understand the problem. What are the problem?

More than 5 years have passed since last update.

@naoya_t

CSVを1行ずつフィールドの型変換をしながら読み込んで(プログレスバーも表示しながら)処理したい

プログレスバーを表示しながら、CSVを1行ずつフィールドの型変換をしながら読んで何か処理をする例。
プログレスバーはclickのやつを。

とりあえずコード貼っときます
(with なんとか... で使えるiteratorの書き方の自分用コピペ例でもある)

field_converter.py はこちら:https://gist.github.com/naoyat/3db8cd96c8dcecb5caea
前の記事「Pythonで "文字列".split() した結果を一括変換したい」のやつです

csv_iterator.py
import sys
import csv
import click
from field_converter import FieldConverter

class CSV_Iterator:
    def __init__(self, path, skip_header=False, with_progress_bar=False,
                 field_converter=None):
        self.path = path
        self.with_progress_bar = with_progress_bar
        self.field_converter = field_converter

        self.f = open(path, 'r')
        self.line_count = sum(1 for line in self.f)

        self.f.seek(0)  # rewind
        self.r = csv.reader(self.f, dialect='excel')
        if skip_header:
            self.r.next()
            self.line_count -= 1

        print '(%d lines)' % (self.line_count,)

        if self.with_progress_bar:
            self.bar = click.progressbar(self.r, self.line_count)

    def __iter__(self):
        return self

    def next(self):
        try:
            if self.with_progress_bar:
                fields = self.bar.next()
            else:
                fields = self.r.next()
            if self.field_converter:
                try:
                    fields = self.field_converter.convert(fields)
                except:
                    print sys.exc_info()
            return fields
        except:
            raise StopIteration

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type:
            return False
        if self.with_progress_bar:
            print
        self.f.close()
        return True

gistに貼っておきます。
https://gist.github.com/naoyat/b1290d917638c412e140

使用例。

example.py
from csv_iterator import CSV_Iterator

def foobar(csv_path):
    with CSV_Iterator(csv_path,
                  skip_header=True,
                  with_progress_bar=True,
                  field_converter=FieldConverter(int, int, 'iso-8859-1', 'iso-8859-1', float)) as line:
        for id, uid, title, query, target in line:
            ...
Why not register and get more from Qiita?
  1. We will deliver articles that match you
    By following users and tags, you can catch up information on technical fields that you are interested in as a whole
  2. you can read useful information later efficiently
    By "stocking" the articles you like, you can search right away
8
Help us understand the problem. What are the problem?