
Converting CIFAR-10 Images

Posted at 2019-04-18

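This post walks through loading images from CIFAR-10 and applying a few simple conversions to them.

What is CIFAR-10?

A dataset of 32x32 color images, each showing one of the following: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, or truck. It can be downloaded from the CIFAR-10 dataset page.

Checking the contents

Download and extract the CIFAR-10 python version. The contents are as follows.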

cifar-10-batches-py
├── batches.meta  
├── data_batch_1  // training data 1
├── data_batch_2  // training data 2
├── data_batch_3  // training data 3
├── data_batch_4  // training data 4
├── data_batch_5  // training data 5
├── readme.html
└── test_batch    // testing data

batches.meta

{b'num_vis': 3072, b'num_cases_per_batch': 10000, b'label_names': [b'airplane', b'automobile', b'bird', b'cat', b'deer', b'dog', b'frog', b'horse', b'ship', b'truck']}
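
As a rough sketch, the metadata can be read with `pickle` and the byte-string class names decoded into ordinary strings. The helper name `load_label_names` is hypothetical and not part of the original code; the path is assumed to point at the extracted `batches.meta`.

```python
import pickle


def load_label_names(meta_path):
    """Return the class names stored in batches.meta as a list of str."""
    with open(meta_path, "rb") as f:
        # encoding="bytes" leaves the keys and names as bytes (b'...').
        meta = pickle.load(f, encoding="bytes")
    # Decode each byte string so it can be printed or used as a plot title.
    return [name.decode("utf-8") for name in meta[b"label_names"]]
```

Calling `load_label_names("./batches.meta")` in the extracted directory should return the ten class names in index order.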

data_batch

Printing everything would be too long, so only the keys are shown.

dict_keys([b'filenames', b'data', b'batch_label', b'labels'])

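data - a 10,000x3,072 array. Each row holds one 32x32 color image: the first 1,024 values are the red channel, the next 1,024 the green channel, and the last 1,024 the blue channel, each stored in row-major order.
labels - a list of 10,000 numbers in the range 0 to 9. Each number is an index into the label_names array of the batches.meta file; for example, 0 means the image is an airplane.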
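
The row layout can be checked on synthetic data. According to the dataset's documentation, each row stores the red plane first (1,024 values), then green, then blue, so a reshape to (3, 32, 32) followed by a transpose gives the usual height x width x channel layout. The sketch below uses a made-up row, not real CIFAR-10 data, and the name `row_to_image` is hypothetical.

```python
import numpy as np


def row_to_image(row):
    """Convert one 3,072-element CIFAR-10 row into a (32, 32, 3) RGB array."""
    row = np.asarray(row, dtype=np.uint8)
    # Split into 3 channel planes of 32x32, then move channels to the last axis.
    return row.reshape(3, 32, 32).transpose(1, 2, 0)


# Synthetic row (not real data): red=10, green=20, blue=30 for every pixel.
fake_row = np.concatenate([np.full(1024, 10), np.full(1024, 20), np.full(1024, 30)])
image = row_to_image(fake_row)
print(image.shape)  # (32, 32, 3)
print(image[0, 0])  # [10 20 30]
```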

Drawing an image

First, let's draw an image.

drawing.py

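import pickle

import matplotlib.pyplot as plt


def unpickle(file):
    """Load one pickled CIFAR-10 batch file into a dict with bytes keys."""
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch


X = unpickle("./data_batch_1")[b'data']
# (10000, 3072) => (10000, 3, 32, 32) => (10000, 32, 32, 3)
X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("uint8")

# Show the first image in the batch.
plt.imshow(X[0])
plt.show()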

Run this program in the directory where the archive was extracted.
The result is shown below. Incidentally, this is an image of a frog.
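
Rather than judging the class by eye, the label stored alongside the image can be looked up. The following is a minimal sketch with a tiny stand-in dictionary in place of a loaded batch; the helper name `describe_image` is hypothetical, and the class names are written out by hand to match `batches.meta`.

```python
def describe_image(batch, label_names, index):
    """Return a string such as '6 (frog)' for the image at `index`."""
    label = batch[b"labels"][index]
    return f"{label} ({label_names[label]})"


# Stand-in for a loaded batch (not real data), using the same key layout.
fake_batch = {b"labels": [6, 9, 0]}
names = ["airplane", "automobile", "bird", "cat", "deer",
         "dog", "frog", "horse", "ship", "truck"]
print(describe_image(fake_batch, names, 0))  # 6 (frog)
```

With real data, the dictionary returned by `unpickle("./data_batch_1")` could be passed in place of `fake_batch`, and the resulting string used as the plot title.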

Converting images

Next, let's apply some conversions to the image.

convert.py

import pickle

from PIL import Image


def unpickle(file):
    """Load one pickled CIFAR-10 batch file into a dict with bytes keys."""
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch


X = unpickle("./data_batch_1")[b'data']
# (10000, 3072) => (10000, 3, 32, 32) => (10000, 32, 32, 3)
X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("uint8")

img = Image.fromarray(X[0])

# Save the original image
img.save('./a.png')

# Enlarge the image
img_resize = img.resize((224, 224))
img_resize.save('./b.png')

# Grayscale
img_gray = img_resize.convert("L")
img_gray.save('./c.png')

# Binarization
# Pixels below 230 become 0; all others become 255.
img_2 = img_gray.point(lambda x: 0 if x < 230 else 255)
img_2.save('./d.png')

# Rotation (90 degrees counterclockwise)
img_roll = img_resize.rotate(90, expand=True)
img_roll.save('./e.png')
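
When enlarging a 32x32 image to 224x224, the resampling filter strongly affects the look of the result, and the default filter used by `resize` has changed between Pillow releases, so it may be worth passing one explicitly. The sketch below compares two filters on a synthetic checkerboard rather than a real CIFAR-10 image.

```python
import numpy as np
from PIL import Image

# Synthetic 32x32 RGB checkerboard standing in for a CIFAR-10 image.
yy, xx = np.mgrid[0:32, 0:32]
pixels = np.where((xx + yy) % 2 == 0, 255, 0).astype(np.uint8)
img = Image.fromarray(np.stack([pixels] * 3, axis=-1))

# NEAREST repeats source pixels, so the result stays blocky.
nearest = img.resize((224, 224), Image.NEAREST)
# BICUBIC interpolates between neighboring pixels, so edges are smoothed.
bicubic = img.resize((224, 224), Image.BICUBIC)

print(nearest.size, bicubic.size)  # (224, 224) (224, 224)
```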

The results are as follows.

Step	Output image
Original image	a.png
Enlarged	b.png
Grayscale	c.png
Binarized	d.png
Rotated	e.png
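
A strict binarization leaves only two pixel values, 0 and 255. As a small check under that assumption, the sketch below applies a threshold with `Image.point` to a synthetic gradient and lists the values that remain. The helper name `binarize` is hypothetical, and the default threshold of 230 simply follows the value used in convert.py.

```python
import numpy as np
from PIL import Image


def binarize(gray_img, threshold=230):
    """Map pixels below `threshold` to 0 and all other pixels to 255."""
    return gray_img.point(lambda x: 0 if x < threshold else 255)


# Synthetic horizontal gradient (0..255) in place of a real grayscale image.
gradient = np.tile(np.arange(256, dtype=np.uint8), (4, 1))
binary = binarize(Image.fromarray(gradient))
print(np.unique(np.asarray(binary)).tolist())  # [0, 255]
```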

References

TensorFlow_CNN_3
PIL/Pillow cheat sheet
