Tips for loading images in Chainer

Chainer

Posted at 2016-12-30

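1, Data format to prepare

When using Chainer's Convolution2D and similar links, the data must be arranged in the layout they expect. That layout differs from the one used by PIL, Python's imaging library, so some care is needed.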

PIL

Data can be loaded as follows.

from PIL import Image  # Pillow; a bare "import Image" only works with the legacy PIL package
import numpy as np

# load from file
img = Image.open("XXX.jpg")
img.show()

# load from array
# xxx needs to be a uint8 array
img = Image.fromarray(xxx, "RGB")
img.show()

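Images can also be processed, for example resized:

# Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias, which was removed in Pillow 10
img = img.resize((277,277), Image.LANCZOS)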

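Regarding PIL's data format, the differences to watch for compared with the array format described next are:

Axis order: color comes last. Viewed as an array, the order is (y, x, color), that is (height, width, color). A 32-pixel-square, 3-color image gives (32,32,3); 1000 such images stacked together give (1000,32,32,3).
Data type: integers from 0 to 255 (uint8). Values scaled to 0-1 are not displayed correctly, so take care.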
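A quick way to check the layout and data type described above is to convert a small non-square image and inspect the result. This is a minimal sketch assuming Pillow and NumPy are installed; the 48x32 size is an arbitrary choice made only so that the two spatial axes can be told apart.

```python
import numpy as np
from PIL import Image

# A blank RGB image that is 48 pixels wide and 32 pixels high (arbitrary size).
img = Image.new("RGB", (48, 32))

arr = np.asarray(img)
print(img.size)   # PIL reports (width, height)
print(arr.shape)  # the array is (height, width, color)
print(arr.dtype)  # 8-bit unsigned integers, 0 to 255
```

For square images such as the 32x32 case the two spatial axes are indistinguishable, which is why a non-square image is used here.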

np.array for feeding into Chainer

Convert from PIL as follows.

# img is a PIL image
arrayImg = np.asarray(img).transpose(2,0,1).astype(np.float32)/255.

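As the code above shows, the key points are:

Axis order: color comes first, (color, y, x). A 32-pixel-square, 3-color image gives (3,32,32).
Data type: float32 scaled to 0-1. Chainer raises an error if the array is not float32.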
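The two points above can be checked without an image file. The following sketch uses a random uint8 array as a stand-in for np.asarray(img) (the random data and the variable names hwc and chw are assumptions for illustration only) and applies the same conversion.

```python
import numpy as np

# Stand-in for np.asarray(img): a random 32x32 RGB image, uint8, color last.
rng = np.random.default_rng(0)
hwc = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Same conversion as in the article: color axis first, float32, scaled to 0-1.
chw = hwc.transpose(2, 0, 1).astype(np.float32) / 255.

print(chw.shape, chw.dtype)
print(float(chw.min()), float(chw.max()))
```

Pixel (row 5, column 7) of channel 0 is hwc[5, 7, 0] before the conversion and chw[0, 5, 7] after it.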

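Other tips

Passing -1 to an array's reshape makes NumPy compute the appropriate size for that axis, which is convenient, e.g. arrayImgList.reshape(-1,3,32,32).
Images passed to Chainer must be float32, whereas the label argument (the target values passed to softmax) must be int32.
Above, transpose(2,0,1) converted PIL -> np.array. For the reverse, np.array -> PIL, use transpose(1,2,0).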
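The three tips above can be tried together in a few lines. This is a sketch with made-up data (the flat array and the label list are hypothetical), using only NumPy.

```python
import numpy as np

# Hypothetical data: 10 flattened images and their class labels.
flat = np.arange(10 * 3 * 32 * 32, dtype=np.float32)
labels = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

# -1 lets NumPy infer the number of images (10 here).
images = flat.reshape(-1, 3, 32, 32)

# Request int32 explicitly; a plain list typically becomes int64 otherwise.
labels = np.asarray(labels, dtype=np.int32)

# Round trip for one image: (color, y, x) -> (y, x, color) -> (color, y, x).
hwc = images[0].transpose(1, 2, 0)
back = hwc.transpose(2, 0, 1)
print(images.shape, labels.dtype, hwc.shape)
```

Before handing hwc to Image.fromarray it would also need to be scaled back to 0-255 and cast to uint8, since PIL expects integer pixel values.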

2, Using iterators

import chainer
from chainer import training
from chainer.datasets import tuple_dataset

# train_x is float32 with shape (1000, 3, 32, 32)
# train_y is int32 with shape (1000,)
train = tuple_dataset.TupleDataset(train_x, train_y)
...
train_iter = chainer.iterators.SerialIterator(train, batch_size=100, repeat=True, shuffle=True)
...
updater = training.StandardUpdater(train_iter, optimizer, device=args.gpu, converter=myConverter)

This creates train_iter and hands it to the updater. The iterator takes one element at a time from train_x and train_y and returns a list [(x1,y1), (x2,y2), ..., (x_b,y_b)], where b is the batch size.
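To make the batch structure concrete, the sketch below builds the same kind of list by hand with plain NumPy. It is a stand-in for one unshuffled iterator step, not Chainer's implementation, and the zero-filled train_x and the modulo-10 labels are placeholder data.

```python
import numpy as np

# Placeholder data with the shapes and dtypes used in the article.
train_x = np.zeros((1000, 3, 32, 32), dtype=np.float32)
train_y = np.arange(1000, dtype=np.int32) % 10
batch_size = 100

# One batch: a list of (image, label) pairs, one pair per example.
batch = [(train_x[i], train_y[i]) for i in range(batch_size)]
x0, y0 = batch[0]
print(len(batch), x0.shape, y0)

# A converter then turns this list into two stacked arrays.
xs = np.stack([x for x, _ in batch])
ys = np.array([y for _, y in batch], dtype=np.int32)
print(xs.shape, ys.shape)
```

Because each batch arrives as a list of per-example tuples, a converter can modify every example individually before stacking.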

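3, Converting and loading data at run time

The converter=myConverter argument above specifies a conversion function. Data can be converted or loaded at run time inside that function.
Using the CIFAR dataset (32 pixels) with AlexNet (originally designed for 227 pixels) requires upsampling. Loading the upsampled images into the iterator in advance would exhaust memory, so it is better to resize the data at run time.
It is also possible to store file names in the iterator and read the files from disk when each batch is loaded, which allows even larger files to be used.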

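import numpy as np
from PIL import Image
from chainer.dataset.convert import concat_examples

def myConverter(batch, device=None, padding=None):
    newBatch = []
    for dat in batch:
        x = dat[0]  # pixel values 0-255 as an array in (color, y, x) order
        y = dat[1]  # label
        # Image.fromarray needs uint8 data with the color axis last
        img = Image.fromarray(x.transpose(1, 2, 0).astype(np.uint8), "RGB")
        # upsample to AlexNet's 227-pixel input
        img = img.resize((227, 227), Image.LANCZOS)
        imarray = np.asarray(img).transpose(2, 0, 1).astype(np.float32) / 255.
        newBatch.append((imarray, y))
    # apply the usual batch processing with concat_examples
    return concat_examples(newBatch, device=device, padding=padding)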

updater = training.StandardUpdater(train_iter, optimizer, device=args.gpu, converter=myConverter)
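The file-name approach mentioned above could look roughly like the following. This is only a sketch under several assumptions: the names load_image and lazy_converter and the loader parameter are hypothetical, the dataset is assumed to hold (file_name, label) pairs, and np.stack stands in for concat_examples so that the example runs on the CPU without Chainer.

```python
import numpy as np

def load_image(path, size=(227, 227)):
    """Read one file and return a (color, y, x) float32 array scaled to 0-1."""
    from PIL import Image  # imported here so the rest of the sketch needs only NumPy
    img = Image.open(path).convert("RGB").resize(size)
    return np.asarray(img).transpose(2, 0, 1).astype(np.float32) / 255.

def lazy_converter(batch, device=None, padding=None, loader=load_image):
    """batch is a list of (file_name, label) pairs; files are read only here."""
    xs = np.stack([loader(path) for path, _ in batch])
    ys = np.asarray([label for _, label in batch], dtype=np.int32)
    # Assumption: CPU only. device and padding are accepted but ignored;
    # the article's converter delegates them to concat_examples.
    return xs, ys
```

With this layout only the file names stay in memory, and the pixel data for one batch exists just long enough to be fed to the model.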
