
YOLO v3 face detection: 01. Dataset preparation

Last updated at 2018-12-03, posted at 2018-11-28

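Overview

Train Darknet YOLO v3 on the WIDER FACE dataset to produce weights
Use the weights and the YOLO v3 network to build a YOLO v3 model converted to Keras
Detect faces with the Keras YOLO v3 model
Predict gender, race, and age from the detected face images using previously built models

These tasks are covered in separate posts:

YOLO v3 face detection: 01. Dataset preparation
YOLO v3 face detection: 02. Training with Darknet
YOLO v3 face detection: 03. Prediction with Keras
https://qiita.com/ha9kberry/items/f75bd95fb2322c00af10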

WIDER FACE
http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/

A face image dataset of 32,203 images
Includes text files with bounding box annotations for face detection

Pretrained models

Gender classification model
https://qiita.com/ha9kberry/items/ae0eabc50a3974c2d92e
Race classification model
https://qiita.com/ha9kberry/items/d7d0b0468552b1b7a804
Age regression model
https://qiita.com/ha9kberry/items/314afb56ee7484c53e6f

Environment
AWS EC2

instance type : p2.xlarge
AMI : Deep Learning AMI (Ubuntu) Version 18.0

References
https://github.com/AlexeyAB/darknet
https://github.com/qqwweee/keras-yolo3

Dataset preparation

Creating the annotation files

Create one YOLO annotation text file per image
From the train split of the WIDER FACE dataset, extract 0--Parade through 20--Family_Group (5,451 images)

generator.ipynb

import os
import cv2
import math
import random

# The ground-truth file was trimmed beforehand to 0--Parade through 20--Family_Group only
txt_file = 'wider_face_split/wider_face_train_bbx_gt.txt'
with open(txt_file) as f:
    num = f.read().count('jpg')
print('number of images:',num)

number of images: 5451
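
Counting every occurrence of the substring 'jpg' works for this file, but it would also count a 'jpg' that appears elsewhere in a path. As a minimal sketch of a stricter alternative, the hypothetical helper below counts only lines that end in '.jpg'. The sample lines are made up and only imitate the layout of the ground-truth file (image path, number of boxes, one line per box).

```python
def count_image_entries(lines):
    """Count lines that name an image, i.e. stripped lines ending in '.jpg'."""
    return sum(1 for line in lines if line.strip().lower().endswith('.jpg'))


# Hypothetical excerpt in the layout of wider_face_train_bbx_gt.txt:
# image path, number of boxes, then one line per box.
sample = [
    '0--Parade/example_a.jpg\n',
    '1\n',
    '10 20 30 40 0 0 0 0 0 0\n',
    '0--Parade/example_b.jpg\n',
    '2\n',
    '5 6 7 8 0 0 0 0 0 0\n',
    '9 10 11 12 0 0 0 0 0 0\n',
]
print(count_image_entries(sample))
```

With a real file, the same helper can be called as `count_image_entries(open(txt_file))`.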

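Annotation file format

Each line describes one bounding box as: class, x coordinate, y coordinate, box width, box height
The coordinates are the center of the bounding box
Coordinates, width, and height are all relative to the image width and height
Each annotation file is saved in the same directory as its image file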
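
The format above can be sketched as a small conversion from a pixel box (top-left corner plus width and height) to one annotation line. The function name `to_yolo_line` and the numbers in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def to_yolo_line(xmin, ymin, w, h, im_w, im_h, class_id=0):
    """Convert a pixel box (top-left corner, width, height) to a YOLO line.

    The center coordinates and the box size are divided by the image size,
    so all four values are relative to the image.
    """
    xcenter = (xmin + w / 2) / im_w
    ycenter = (ymin + h / 2) / im_h
    values = (class_id, xcenter, ycenter, w / im_w, h / im_h)
    return ' '.join(str(v) for v in values)


# Hypothetical 1000x500 image with a 200x100 box whose top-left corner is (100, 50)
line = to_yolo_line(100, 50, 200, 100, im_w=1000, im_h=500)
print(line)  # 0 0.2 0.2 0.2 0.2
```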

generator.ipynb

def generator():

    with open(txt_file) as f:
        img_paths=[]

        for i in range(num):
            # Read one line and strip surrounding whitespace and the newline
            img_path=f.readline().strip()
            # Collect the image paths
            img_paths.append(img_path)
            # Load the image to get its width and height
            im = cv2.imread(img_path)
            im_h=im.shape[0]
            im_w=im.shape[1]
            # Split on '/'
            split=img_path.split('/')
            # Directory in which to store the annotation file
            dir_name=split[0]
            # Build the annotation file name
            file_name=split[1].replace('.jpg', '.txt')
            # Number of boxes for this image
            count = int(f.readline())
            readlines=[]

            for j in range(count):
                readline=f.readline().split()
                # Top-left corner of the box
                xmin=int(readline[0])
                ymin=int(readline[1])
                # Width and height of the box
                w=int(readline[2])
                h=int(readline[3])
                # Center of the box as relative values
                xcenter=str((xmin+w/2)/im_w)
                ycenter=str((ymin+h/2)/im_h)
                # Convert width and height to relative values
                w=str(w/im_w)
                h=str(h/im_h)

                class_num='0'
                # Join class, x, y, box width, box height with single spaces
                string=' '.join([class_num, xcenter, ycenter, w, h])
                readlines.append(string)
            # Join the box lines with newlines
            readlines_str='\n'.join(readlines)
            # Save the annotation file in the image's directory
            # (the handle is named out so it does not shadow the loop variable j)
            with open(dir_name+'/'+file_name, 'w') as out:
                out.write(readlines_str)

    # Return the list of image paths
    return img_paths

img_paths = generator()
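
The reading loop above assumes that each image entry is followed by exactly as many box lines as its count. As a hedged sketch, the hypothetical `parse_wider_gt` below separates the parsing from the file writing and additionally tolerates one placeholder row after an image with a count of 0. Whether your copy of the ground-truth file contains such rows is an assumption to verify, not something the original post states. The sample rows are made up.

```python
def parse_wider_gt(lines):
    """Return {image_path: [(xmin, ymin, w, h), ...]} from ground-truth lines."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    boxes = {}
    i = 0
    while i < len(rows):
        path = rows[i]
        count = int(rows[i + 1])
        i += 2
        entries = []
        for _ in range(count):
            # Only the first four fields (xmin, ymin, width, height) are used
            x, y, w, h = (int(v) for v in rows[i].split()[:4])
            entries.append((x, y, w, h))
            i += 1
        # Assumption: a zero-box image may be followed by one placeholder row
        if count == 0 and i < len(rows) and not rows[i].lower().endswith('.jpg'):
            i += 1
        boxes[path] = entries
    return boxes


# Hypothetical rows; the first image has no boxes and a placeholder row
sample_rows = [
    'demo/x.jpg', '0', '0 0 0 0 0 0 0 0 0 0',
    'demo/y.jpg', '1', '1 2 3 4 0 0 0 0 0 0',
]
parsed = parse_wider_gt(sample_rows)
print(parsed)
```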

Annotation file for 0_Parade_marchingband_1_45.jpg

0_Parade_marchingband_1_45.txt

0 0.68017578125 0.9162995594713657 0.0400390625 0.07342143906020558
0 0.02978515625 0.8127753303964758 0.0185546875 0.027900146842878122
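
To sanity-check a generated file, one option is to convert a line back to pixel coordinates and confirm that the relative values stay within [0, 1]. The helper name `yolo_line_to_pixel_box`, the range check, and the example values are assumptions made for illustration only.

```python
def yolo_line_to_pixel_box(line, im_w, im_h):
    """Convert one YOLO annotation line back to (class, xmin, ymin, w, h) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    # Relative values are expected to lie between 0 and 1
    if not all(0.0 <= v <= 1.0 for v in (xc, yc, w, h)):
        raise ValueError('expected relative values in [0, 1]: ' + line)
    box_w = w * im_w
    box_h = h * im_h
    xmin = xc * im_w - box_w / 2
    ymin = yc * im_h - box_h / 2
    return int(class_id), round(xmin), round(ymin), round(box_w), round(box_h)


# Hypothetical line for a 100x50 image
box = yolo_line_to_pixel_box('0 0.5 0.5 0.2 0.4', im_w=100, im_h=50)
print(box)  # (0, 40, 15, 20, 20)
```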

Create text files listing the image paths used for the train data and the test data

generator.ipynb

# Fraction of the images used as test data
test_size=0.1
test_num=math.floor(num*test_size)
train_num=num-test_num
print('number of train images:', train_num)
print('number of test images:', test_num)

number of train images: 4906
number of test images: 545

generator.ipynb

# Directory that holds the image data
dir_path="face/"
img_paths=[dir_path+i for i in img_paths]

# Create the train file
train_list=img_paths[:train_num]
train_str='\n'.join(train_list)
with open('train.txt', 'w') as f:
    f.write(train_str)

# Create the test file
test_list=img_paths[train_num:]
test_str='\n'.join(test_list)
with open('test.txt', 'w') as f:
    f.write(test_str)

train.txt

face/0--Parade/0_Parade_marchingband_1_849.jpg
face/0--Parade/0_Parade_Parade_0_904.jpg
face/0--Parade/0_Parade_marchingband_1_799.jpg
face/0--Parade/0_Parade_marchingband_1_117.jpg
face/0--Parade/0_Parade_marchingband_1_778.jpg
...
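
The split above slices the path list in file order, so the test list is made up of the last entries of the ground-truth file, and the imported `random` module is not used. If a random split is preferred, a minimal sketch could look like the following. The function name `split_paths`, the fixed seed, and the sample paths are hypothetical and not part of the original notebook.

```python
import math
import random


def split_paths(paths, test_size=0.1, seed=0):
    """Shuffle a copy of the paths and split it into train and test lists."""
    shuffled = list(paths)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    test_num = math.floor(len(shuffled) * test_size)
    train_num = len(shuffled) - test_num
    return shuffled[:train_num], shuffled[train_num:]


# Hypothetical paths standing in for img_paths
paths = ['face/example/img_%d.jpg' % i for i in range(20)]
train_list, test_list = split_paths(paths)
print(len(train_list), len(test_list))  # 18 2
```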

Dataset directory structure

face/
  ┣ 0--Parade/
  ┣ 1--Handshaking/
  ...
  ┣ 20--Family_Group/
  ┣ wider_face_split/
  ┃  ┗ wider_face_train_bbx_gt.txt
  ┣ generator.ipynb
  ┣ test.txt
  ┗ train.txt
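
Because each annotation file is saved next to its image, it can be worth confirming before training that every listed image has a matching .txt file. The sketch below is one way to do that. The helper name `missing_annotations` and the `root` parameter are hypothetical, `root` is assumed to be the directory that the listed paths are relative to, and the demo files are created in a temporary directory purely for illustration.

```python
import os
import tempfile


def missing_annotations(list_file, root='.'):
    """Return the listed image paths that have no .txt file next to them."""
    missing = []
    with open(list_file) as f:
        for line in f:
            img_path = line.strip()
            if not img_path:
                continue
            # Same directory and base name as the image, with a .txt extension
            txt_path = os.path.splitext(img_path)[0] + '.txt'
            if not os.path.isfile(os.path.join(root, txt_path)):
                missing.append(img_path)
    return missing


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'face', 'demo'))
    # a.jpg has an annotation file, b.jpg does not
    open(os.path.join(root, 'face', 'demo', 'a.txt'), 'w').close()
    list_file = os.path.join(root, 'train.txt')
    with open(list_file, 'w') as f:
        f.write('face/demo/a.jpg\nface/demo/b.jpg')
    result = missing_annotations(list_file, root)
print(result)  # ['face/demo/b.jpg']
```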

This dataset is used to train the Darknet YOLO v3 model
