Classifying Celebrity Faces with Machine Learning (Part 1)

Posted at 2018-03-25

Data Collection

Images were collected with Google's image search API (Custom Search).
※ Chosen for the simple reason that I already had a Google API key.
※ Because of the API's limits, it seems that no more than 100 images can be collected per search word.
 → Next time I will collect images with the Bing API (reportedly up to 800 images per query); a rough sketch of what that could look like follows.
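For reference, here is a rough, untested sketch of collecting image URLs with the Bing Image Search API. The endpoint, header, and parameter names are my assumptions about the v7 API and should be checked against the current documentation:

# Hypothetical sketch of collecting image URLs with the Bing Image Search API (v7).
# The endpoint, header, and parameter names are assumptions, not verified in this article.
import requests

BING_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"  # assumed endpoint
BING_API_KEY = "xxxxx"  # subscription key

def bing_image_urls(search_word, total=800, per_request=150):
    """Collect up to `total` image URLs for `search_word`, paging with `offset`."""
    headers = {"Ocp-Apim-Subscription-Key": BING_API_KEY}
    urls = []
    for offset in range(0, total, per_request):
        params = {"q": search_word, "count": per_request, "offset": offset, "mkt": "ja-JP"}
        res = requests.get(BING_ENDPOINT, headers=headers, params=params)
        res.raise_for_status()
        items = res.json().get("value", [])
        if not items:
            break
        urls.extend(item["contentUrl"] for item in items)
    return urls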

Results
  • Images of 石原さとみ: 86
  • Images of 水原希子: 87
google_api.py
#-*- coding:utf-8 -*-
import urllib.request
import httplib2
import json
import os
import pickle
import hashlib
import sha3  # pysha3 backport; on Python 3.6+ hashlib already provides sha3_256

from googleapiclient.discovery import build


def make_dir(path):
    if not os.path.isdir(path):
        os.mkdir(path)


def make_correspondence_table(correspondence_table, original_url, hashed_url):
    correspondence_table[original_url] = hashed_url


def getImageUrl(api_key, cse_key, search_word, page_limit, save_dir_path):

    service = build("customsearch", "v1", developerKey=api_key)
    startIndex = 1
    response = []

    img_list = []

    make_dir(save_dir_path)
    save_res_path = os.path.join(save_dir_path, 'api_response_file')
    make_dir(save_res_path)

    for nPage in range(0, page_limit):
        print("Reading page number:", nPage + 1)

        try:
            response.append(service.cse().list(
                q=search_word,     # Search words
                cx=cse_key,        # custom search engine key
                lr='lang_ja',      # Search language
                num=10,            # Number of images obtained by one request (Max 10)
                start=startIndex,
                searchType='image' # search for images
            ).execute())

            startIndex = response[nPage].get("queries").get("nextPage")[0].get("startIndex")

        except Exception as e:
            print(e)

    with open(os.path.join(save_res_path, 'api_response.pickle'), mode='wb') as f:
        pickle.dump(response, f)

    for one_res in range(len(response)):
        if len(response[one_res]['items']) > 0:
            for i in range(len(response[one_res]['items'])):
                img_list.append(response[one_res]['items'][i]['link'])

    return img_list


def getImage(save_dir_path, img_list):
    make_dir(save_dir_path)
    save_img_path = os.path.join(save_dir_path, 'imgs')
    make_dir(save_img_path)

    opener = urllib.request.build_opener()
    http = httplib2.Http(".cache")

    for i in range(len(img_list)):
        try:
            url = img_list[i]
            extension = os.path.splitext(img_list[i])[-1]
            if extension.lower() in ('.jpg', '.jpeg', '.gif', '.png', '.bmp'):
                encoded_url = url.encode('utf-8')  # required encoding for hashed
                hashed_url = hashlib.sha3_256(encoded_url).hexdigest()
                full_path = os.path.join(save_img_path, hashed_url + extension.lower())

                response, content = http.request(url)
                with open(full_path, 'wb') as f:
                    f.write(content)
                print('saved image... {}'.format(url))

                # correspondence_table is the module-level dict defined under __main__
                make_correspondence_table(correspondence_table, url, hashed_url)

        except Exception as e:
            print("failed to download image: {} ({})".format(img_list[i], e))
            continue


if __name__ == '__main__':
    # -------------- Parameter and Path Settings -------------- #
    API_KEY = 'xxxxx'
    CUSTOM_SEARCH_ENGINE = 'xxxxxx'

    page_limit = 10
    search_word = '石原さとみ'
    save_dir_path = './img1'

    correspondence_table = {}

    img_list = getImageUrl(API_KEY, CUSTOM_SEARCH_ENGINE, search_word, page_limit, save_dir_path)
    getImage(save_dir_path, img_list)

    correspondence_table_path = os.path.join(save_dir_path, 'corr_table')
    make_dir(correspondence_table_path)

    with open(os.path.join(correspondence_table_path, 'corr_table.json'), mode='w') as f:
        json.dump(correspondence_table, f)

Cropping Out Just the Faces

The face region is cropped from each image using OpenCV.

face_trim.py
# -*- coding: utf-8 -*-

import sys
import cv2

# Store the command-line arguments
params = sys.argv
argc = len(params)

if(argc != 2):
    print('Please run the script with an image file name as an argument.')
    quit()

# Image directory definitions
inDir = "./data/imgs/"
outDir = "./data/face_only/"
errDir = "./data/error/"

# Load the cascade classifier (change the path to match your environment)
cascade_path = "/Users/yuni/anaconda/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_alt.xml"

# Face detection cascades bundled with OpenCV by default:
# haarcascade_frontalface_default.xml
# haarcascade_frontalface_alt.xml
# haarcascade_frontalface_alt2.xml
# haarcascade_frontalface_alt_tree.xml
# haarcascade_profileface.xml

image_path = inDir + params[1]

print(image_path)

# Read the image file
image = cv2.imread(image_path)
if(image is None):
    print('Could not open the image.')
    quit()

# Convert to grayscale (cv2.imread returns BGR, so use COLOR_BGR2GRAY)
image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Load the cascade classifier features
cascade = cv2.CascadeClassifier(cascade_path)


# Run object detection (face detection)
facerect = cascade.detectMultiScale(image_gray, scaleFactor=1.1, minNeighbors=1, minSize=(1, 1))

if len(facerect) == 1:
    print ("顔認識に成功しました。")
    print (facerect)

    #検出した顔の処理
    for rect in facerect:
        #顔だけ切り出して保存
        x = rect[0]
        y = rect[1]
        width = rect[2]
        height = rect[3]
        dst = image[y:y+height, x:x+width]
        new_image_path = outDir + params[1]
        cv2.imwrite(new_image_path, dst)

elif len(facerect) > 1:
    # Skip images where more than one face was detected
    print ("Multiple faces were detected.")
    print (facerect)

    # Save the image with the detected faces outlined, for manual review
    color = (255, 255, 255) # white
    for rect in facerect:
        # Draw a rectangle around each detected face
        cv2.rectangle(image, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]), color, thickness=2)

    new_image_path = errDir + params[1]
    cv2.imwrite(new_image_path, image)

    quit()

else:
    # Also skip images where no face was detected
    print ("No face could be detected.")
    quit()

Running the Program with a Shell Script

face_trim.sh
#!/bin/bash

# Image directory definition (source images to process)
out='./data/imgs/'

# Name of the image-processing script
script='face_trim.py'

for file in `ls ${out}`; do
    python ${script} ${file}
done
Results
  • Images of 石原さとみ: 44
  • Images of 水原希子: 50
Why the number of images dropped
  • Face detection errors (roughly 25 images for each person)
  • Duplicates among the collected source images were not used as data (see the hash-based sketch below)
  • Images with sunglasses were removed
  • Childhood (very young) photos were removed
  • Images that were not actually the person were removed
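The duplicate check itself was done by eye; a small sketch like the following can flag byte-identical files automatically by hashing their contents (the directory path is just an example, and near-duplicates such as resized copies would still need a manual look):

# Sketch: find byte-identical duplicates among the downloaded images by hashing file contents.
# The directory path below is an example; point it at wherever the images were saved.
import hashlib
import os

def find_duplicates(img_dir='./img1/imgs'):
    seen = {}          # content hash -> first file seen with that content
    duplicates = []    # (duplicate file, original file) pairs
    for name in sorted(os.listdir(img_dir)):
        path = os.path.join(img_dir, name)
        with open(path, 'rb') as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest in seen:
            duplicates.append((path, seen[digest]))
        else:
            seen[digest] = path
    return duplicates

for dup, original in find_duplicates():
    print('duplicate:', dup, '==', original)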

Preparation (splitting into train/test data and creating the label text files)

Since the dataset is small, this was done by hand (a sketch of how the label files could be generated follows the list).

  • train data: 石原さとみ (34 images), 水原希子 (29 images)
  • test data: 石原さとみ (16 images), 水原希子 (15 images)
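train.txt and test.txt are plain text files with one "<image path> <label number>" pair per line, which is the format the training script below reads. Here is a minimal sketch for generating them; the per-person directory names are assumptions about how the hand-sorted images were stored:

# Sketch: write 'path label' lines in the format testFaceType.py expects.
# The per-person directories below are hypothetical; adjust them to your layout.
import os

def write_label_file(out_path, dirs_and_labels):
    with open(out_path, 'w') as f:
        for img_dir, label in dirs_and_labels:
            for name in sorted(os.listdir(img_dir)):
                f.write('{} {}\n'.format(os.path.join(img_dir, name), label))

# Example label assignment: 0 = 石原さとみ, 1 = 水原希子
write_label_file('train.txt', [('./data/train/ishihara', 0), ('./data/train/mizuhara', 1)])
write_label_file('test.txt',  [('./data/test/ishihara', 0), ('./data/test/mizuhara', 1)])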

Running the Machine Learning

testFaceType.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import cv2
import numpy as np
import tensorflow as tf
import tensorflow.python.platform

NUM_CLASSES = 2
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('train', 'train.txt', 'File name of train data')
flags.DEFINE_string('test', 'test.txt', 'File name of test data')
flags.DEFINE_string('train_dir', './data', 'Directory to put the training data.')
flags.DEFINE_integer('max_steps', 200, 'Number of steps to run trainer.')
flags.DEFINE_integer('batch_size', 10, 'Batch size'
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')

def inference(images_placeholder, keep_prob):
    """ 予測モデルを作成する関数

    引数: 
      images_placeholder: 画像のplaceholder
      keep_prob: dropout率のplace_holder

    返り値:
      y_conv: 各クラスの確率(のようなもの)
    """
    # Initialize weights from a truncated normal distribution with stddev 0.1
    def weight_variable(shape):
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)

    # Initialize biases to the constant 0.1
    def bias_variable(shape):
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)

    # Convolution layer
    def conv2d(x, W):
      return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    # Pooling layer (2x2 max pooling)
    def max_pool_2x2(x):
      return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')

    # Reshape the flat input back into 28x28x3 images
    x_image = tf.reshape(images_placeholder, [-1, 28, 28, 3])

    # Convolution layer 1
    with tf.name_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 3, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    # Pooling layer 1
    with tf.name_scope('pool1') as scope:
        h_pool1 = max_pool_2x2(h_conv1)

    # Convolution layer 2
    with tf.name_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    # Pooling layer 2
    with tf.name_scope('pool2') as scope:
        h_pool2 = max_pool_2x2(h_conv2)

    # Fully connected layer 1
    with tf.name_scope('fc1') as scope:
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        # Apply dropout
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Fully connected layer 2
    with tf.name_scope('fc2') as scope:
        W_fc2 = weight_variable([1024, NUM_CLASSES])
        b_fc2 = bias_variable([NUM_CLASSES])

    # Normalize with the softmax function
    with tf.name_scope('softmax') as scope:
        y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    # Return something like the probability of each label
    return y_conv

def loss(logits, labels):
    """ lossを計算する関数

    引数:
      logits: ロジットのtensor, float - [batch_size, NUM_CLASSES]
      labels: ラベルのtensor, int32 - [batch_size, NUM_CLASSES]

    返り値:
      cross_entropy: 交差エントロピーのtensor, float

    """

    # 交差エントロピーの計算
    cross_entropy = -tf.reduce_sum(labels*tf.log(logits))
    # TensorBoardで表示するよう指定
    tf.summary.scalar("cross_entropy", cross_entropy)
    return cross_entropy

def training(loss, learning_rate):
    """ 訓練のOpを定義する関数

    引数:
      loss: 損失のtensor, loss()の結果
      learning_rate: 学習係数

    返り値:
      train_step: 訓練のOp

    """

    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return train_step

def accuracy(logits, labels):
    """ 正解率(accuracy)を計算する関数

    引数: 
      logits: inference()の結果
      labels: ラベルのtensor, int32 - [batch_size, NUM_CLASSES]

    返り値:
      accuracy: 正解率(float)

    """
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    tf.summary.scalar("accuracy", accuracy)
    return accuracy

if __name__ == '__main__':
    # Open the training label file
    f = open(FLAGS.train, 'r')
    # Arrays to hold the data
    train_image = []
    train_label = []
    for line in f:
        # Strip the newline and split on whitespace
        line = line.rstrip()
        l = line.split()
        # Read the image and shrink it to 28x28
        img = cv2.imread(l[0])
        if(img is None):
            print('Could not open the image.')
            quit()
        # print(l[0])
        # (uncomment below to display each image for debugging)
        img = cv2.resize(img, (28, 28))
        # cv2.imshow('image', img)
        # cv2.waitKey(0)
        # Flatten to one row and scale to float values in 0-1
        train_image.append(img.flatten().astype(np.float32)/255.0)
        # Prepare the label as a one-hot (1-of-K) vector
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        train_label.append(tmp)
    # Convert to numpy arrays
    train_image = np.asarray(train_image)
    train_label = np.asarray(train_label)
    f.close()

    f = open(FLAGS.test, 'r')
    test_image = []
    test_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (28, 28))
        test_image.append(img.flatten().astype(np.float32)/255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        test_label.append(tmp)
    test_image = np.asarray(test_image)
    test_label = np.asarray(test_label)
    f.close()

    with tf.Graph().as_default():
        # Placeholder Tensor for the images
        images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
        # Placeholder Tensor for the labels
        labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
        # Placeholder Tensor for the dropout keep probability
        keep_prob = tf.placeholder("float")

        # Call inference() to build the model
        logits = inference(images_placeholder, keep_prob)
        # Call loss() to compute the loss
        loss_value = loss(logits, labels_placeholder)
        # Call training() to define the training op
        train_op = training(loss_value, FLAGS.learning_rate)
        # Compute the accuracy
        acc = accuracy(logits, labels_placeholder)

        # Prepare for saving the model
        saver = tf.train.Saver()
        # Create the Session
        sess = tf.Session()
        # Initialize the variables
        sess.run(tf.global_variables_initializer())
        # Set up the values shown in TensorBoard
        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter(FLAGS.train_dir, sess.graph)

        # Run the training
        for step in range(FLAGS.max_steps):
            for i in range(int(len(train_image)/FLAGS.batch_size)):
                # Run one training step on batch_size images
                batch = FLAGS.batch_size*i
                # Use feed_dict to supply the data for each placeholder
                sess.run(train_op, feed_dict={
                  images_placeholder: train_image[batch:batch+FLAGS.batch_size],
                  labels_placeholder: train_label[batch:batch+FLAGS.batch_size],
                  keep_prob: 0.5})

            # Compute the training accuracy after each step
            train_accuracy = sess.run(acc, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            print ("step %d, training accuracy %g"%(step, train_accuracy))

            # Add the values shown in TensorBoard after each step
            summary_str = sess.run(summary_op, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            summary_writer.add_summary(summary_str, step)

    # After training, print the accuracy on the test data
    print ("test accuracy %g"%sess.run(acc, feed_dict={
        images_placeholder: test_image,
        labels_placeholder: test_label,
        keep_prob: 1.0}))

    # Save the final model
    save_path = saver.save(sess, "model.ckpt")
Results
$python3 testFaceType.py
2018-03-25 15:02:40.874719: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-03-25 15:02:40.874760: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-03-25 15:02:40.874772: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-03-25 15:02:40.874781: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-03-25 15:02:40.874789: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
step 0, training accuracy 0.460317
step 1, training accuracy 0.539683
step 2, training accuracy 0.539683
step 3, training accuracy 0.634921
step 4, training accuracy 0.68254
step 5, training accuracy 0.571429
step 6, training accuracy 0.714286
step 7, training accuracy 0.730159
step 8, training accuracy 0.809524
step 9, training accuracy 0.809524
step 10, training accuracy 0.84127
step 11, training accuracy 0.873016
step 12, training accuracy 0.873016
step 13, training accuracy 0.873016
step 14, training accuracy 0.857143
step 15, training accuracy 0.888889
step 16, training accuracy 0.888889
step 17, training accuracy 0.888889
step 18, training accuracy 0.904762
step 19, training accuracy 0.714286
step 20, training accuracy 0.888889
step 21, training accuracy 0.936508
step 22, training accuracy 0.936508
step 23, training accuracy 0.952381
step 24, training accuracy 0.952381
step 25, training accuracy 0.984127
step 26, training accuracy 0.952381
step 27, training accuracy 0.968254
step 28, training accuracy 1
step 29, training accuracy 0.952381
step 30, training accuracy 0.984127
step 31, training accuracy 0.952381
step 32, training accuracy 1
step 33, training accuracy 0.952381
step 34, training accuracy 0.952381
step 35, training accuracy 0.952381
step 36, training accuracy 1
step 37, training accuracy 1
step 38, training accuracy 1
step 39, training accuracy 0.952381
step 40, training accuracy 0.952381
step 41, training accuracy 0.968254
step 42, training accuracy 0.952381
step 43, training accuracy 0.968254
step 44, training accuracy 0.968254
step 45, training accuracy 0.968254
step 46, training accuracy 0.968254
step 47, training accuracy 0.968254
step 48, training accuracy 0.984127
step 49, training accuracy 1
step 50, training accuracy 1
step 51, training accuracy 0.968254
step 52, training accuracy 0.952381
step 53, training accuracy 0.968254
step 54, training accuracy 0.968254
step 55, training accuracy 0.968254
step 56, training accuracy 1
step 57, training accuracy 1
step 58, training accuracy 0.984127
step 59, training accuracy 1
step 60, training accuracy 1
step 61, training accuracy 1
step 62, training accuracy 0.984127
step 63, training accuracy 0.984127
step 64, training accuracy 1
step 65, training accuracy 1
step 66, training accuracy 0.984127
step 67, training accuracy 1
step 68, training accuracy 1
step 69, training accuracy 1
step 70, training accuracy 0.984127
step 71, training accuracy 1
step 72, training accuracy 1
step 73, training accuracy 1
step 74, training accuracy 1
step 75, training accuracy 0.968254
step 76, training accuracy 1
step 77, training accuracy 1
step 78, training accuracy 1
step 79, training accuracy 1
step 80, training accuracy 0.984127
step 81, training accuracy 0.968254
step 82, training accuracy 0.968254
step 83, training accuracy 0.984127
step 84, training accuracy 1
step 85, training accuracy 1
step 86, training accuracy 1
step 87, training accuracy 0.984127
step 88, training accuracy 1
step 89, training accuracy 1
step 90, training accuracy 0.984127
step 91, training accuracy 0.984127
step 92, training accuracy 1
step 93, training accuracy 1
step 94, training accuracy 1
step 95, training accuracy 0.984127
step 96, training accuracy 0.984127
step 97, training accuracy 0.984127
step 98, training accuracy 0.984127
step 99, training accuracy 0.984127
step 100, training accuracy 0.984127
step 101, training accuracy 0.984127
step 102, training accuracy 1
step 103, training accuracy 1
step 104, training accuracy 0.984127
step 105, training accuracy 0.984127
step 106, training accuracy 0.984127
step 107, training accuracy 0.984127
step 108, training accuracy 0.984127
step 109, training accuracy 0.984127
step 110, training accuracy 1
step 111, training accuracy 1
step 112, training accuracy 1
step 113, training accuracy 0.984127
step 114, training accuracy 0.968254
step 115, training accuracy 0.984127
step 116, training accuracy 1
step 117, training accuracy 1
step 118, training accuracy 1
step 119, training accuracy 0.984127
step 120, training accuracy 0.984127
step 121, training accuracy 0.984127
step 122, training accuracy 0.984127
step 123, training accuracy 0.984127
step 124, training accuracy 1
step 125, training accuracy 1
step 126, training accuracy 1
step 127, training accuracy 1
step 128, training accuracy 0.984127
step 129, training accuracy 0.984127
step 130, training accuracy 1
step 131, training accuracy 1
step 132, training accuracy 1
step 133, training accuracy 0.984127
step 134, training accuracy 1
step 135, training accuracy 1
step 136, training accuracy 1
step 137, training accuracy 1
step 138, training accuracy 1
step 139, training accuracy 0.968254
step 140, training accuracy 0.968254
step 141, training accuracy 0.984127
step 142, training accuracy 1
step 143, training accuracy 1
step 144, training accuracy 1
step 145, training accuracy 1
step 146, training accuracy 1
step 147, training accuracy 0.984127
step 148, training accuracy 0.984127
step 149, training accuracy 0.984127
step 150, training accuracy 1
step 151, training accuracy 1
step 152, training accuracy 1
step 153, training accuracy 1
step 154, training accuracy 0.984127
step 155, training accuracy 0.984127
step 156, training accuracy 0.984127
step 157, training accuracy 0.984127
step 158, training accuracy 1
step 159, training accuracy 1
step 160, training accuracy 1
step 161, training accuracy 1
step 162, training accuracy 1
step 163, training accuracy 1
step 164, training accuracy 1
step 165, training accuracy 1
step 166, training accuracy 1
step 167, training accuracy 1
step 168, training accuracy 1
step 169, training accuracy 1
step 170, training accuracy 0.984127
step 171, training accuracy 1
step 172, training accuracy 1
step 173, training accuracy 1
step 174, training accuracy 1
step 175, training accuracy 1
step 176, training accuracy 1
step 177, training accuracy 1
step 178, training accuracy 1
step 179, training accuracy 1
step 180, training accuracy 1
step 181, training accuracy 1
step 182, training accuracy 1
step 183, training accuracy 1
step 184, training accuracy 0.984127
step 185, training accuracy 0.984127
step 186, training accuracy 1
step 187, training accuracy 1
step 188, training accuracy 1
step 189, training accuracy 1
step 190, training accuracy 1
step 191, training accuracy 1
step 192, training accuracy 1
step 193, training accuracy 1
step 194, training accuracy 1
step 195, training accuracy 1
step 196, training accuracy 0.984127
step 197, training accuracy 0.984127
step 198, training accuracy 1
step 199, training accuracy 1
test accuracy 0.833333

The test accuracy came out to 83%. (Really?? Is this overfitting?)
For now, all I can say is that I got it running.
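One way to sanity-check the overfitting question would be to also evaluate test accuracy at every step and watch whether it stalls or drops while training accuracy keeps climbing. A minimal sketch of the lines that could be added inside the per-step part of the training loop in testFaceType.py, using the variables already defined there:

# Sketch: log test accuracy alongside training accuracy at each step.
# sess, acc, the placeholders, test_image and test_label are the ones defined above.
test_accuracy = sess.run(acc, feed_dict={
    images_placeholder: test_image,
    labels_placeholder: test_label,
    keep_prob: 1.0})
print("step %d, training accuracy %g, test accuracy %g" % (step, train_accuracy, test_accuracy))

With only about 30 test images, though, that number will be noisy, so collecting more data (the plan with the Bing API) is the more reliable check.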
