Reading a file with read_csv

Specifying column data types (converters)

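When loading a file with read_csv, you can use converters to control the data type of each column.
Pass converters a dict that maps each column name to a conversion function; passing str keeps the values as strings.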

import pandas as pd
df = pd.read_csv('input_file.tsv', sep='\t', converters={'col_name_a': str, 'col_name_b': str})
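
To see what converters changes, here is a minimal sketch that reads the same data twice. The in-memory sample and the column names `code` and `qty` are made up for illustration and stand in for input_file.tsv.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for input_file.tsv.
tsv_text = "code\tqty\n007\t1\n042\t2\n"

# Default type inference parses "code" as integers, dropping the leading zeros.
df_default = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# With converters, str is called on each raw field, so the text is kept as-is.
df_str = pd.read_csv(io.StringIO(tsv_text), sep="\t", converters={"code": str})

print(df_default["code"].tolist())
print(df_str["code"].tolist())
```

Columns that are not listed in converters still go through the usual type inference, so only the columns you name are affected.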

You will rarely need this, but it is worth trying when a warning like the one below appears while loading.
Warning text (the path is slightly altered):

python3.7/site-packages/IPython/core/interactiveshell.py:2785: DtypeWarning: Columns (23,24,25,33,34,35) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)

The file is still loaded, so this appears to be only a warning and not an error.

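Inspecting the DataFrame showed that the columns meant to hold strings contained the string "Nan" in place of null values, so pandas apparently judged them to be columns that mix str and float.
The warning lists the indices of the affected columns, so I set all of them to str and the warning disappeared.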
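
The fix described above can be sketched as follows. This is an illustration under assumptions, not the exact code used for the original file: the sample data and the list `flagged_columns` are hypothetical, and in practice the indices would be copied from the DtypeWarning message, such as (23,24,25,33,34,35). It relies on converters accepting integer column positions as keys as well as column labels.

```python
import io

import pandas as pd

# Hypothetical sample; the real file and its columns are not shown in the article.
tsv_text = "id\tname\tnote\n1\tfoo\t001\n2\tbar\tabc\n"

# Assumed column positions, standing in for the indices listed in the warning.
flagged_columns = [1, 2]

# Map every flagged column index to str so its values are read as strings.
converters = {index: str for index in flagged_columns}

df = pd.read_csv(io.StringIO(tsv_text), sep="\t", converters=converters)

print(df["name"].tolist())
print(df["note"].tolist())
```

The warning text itself also points to the dtype option and low_memory=False as alternatives.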

wwacky