Changing the gradient descent optimizer and adding data augmentation in the Object Detection API

Posted at 2018-10-03

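I couldn't find many posts on how to set up data augmentation or change the gradient descent optimizer in the Object Detection API, so I'm leaving these notes.

Intended audience

Anyone who has used the Object Detection API.

Changing the gradient descent optimizer

Default state

In the default config, you should find the following block around line 70.
We will change the optimizer part of it.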

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
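To get a feel for what exponential_decay_learning_rate does with these numbers, here is a minimal Python sketch. It assumes the schedule is initial_learning_rate * decay_factor ^ (global_step / decay_steps), with the division floored by default (staircase behavior). That is my reading of optimizer.proto, not something checked against the training code. The function name exponential_decay is made up for this illustration.

```python
def exponential_decay(initial_learning_rate, global_step, decay_steps,
                      decay_factor, staircase=True):
    """Learning rate after global_step steps (assumed formula, see text)."""
    if staircase:
        # Staircase mode: the rate drops in discrete jumps every decay_steps.
        exponent = global_step // decay_steps
    else:
        # Continuous mode: the rate shrinks a little on every step.
        exponent = global_step / decay_steps
    return initial_learning_rate * decay_factor ** exponent


# Values taken from the default config above.
for step in (0, 400000, 800720, 1601440):
    print(step, exponential_decay(0.004, step, 800720, 0.95))
```

Under the staircase assumption, the rate would not change at all until step 800720, so for shorter training runs it is effectively constant at 0.004.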

Replacing it with Adam

Settings such as momentum_optimizer_value are not needed here, so I removed them wholesale.

train_config: {
  batch_size: 24
  optimizer {
    adam_optimizer: {
      learning_rate {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
        }
      }
    }
  }

In my case, this improved accuracy somewhat.
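For reference, a single Adam update can be sketched in plain Python as follows. This is the textbook Adam rule, not code from the Object Detection API. adam_step is a hypothetical helper, and the beta1, beta2, and epsilon defaults are the commonly cited values, assumed here for illustration only.

```python
import math


def adam_step(param, grad, m, v, t, learning_rate=0.004,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one Adam update to a scalar parameter; t starts at 1."""
    # Running averages of the gradient and of the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    x, m, v = adam_step(x, 2.0 * x, m, v, t)
print(x)
```

Because the step is normalized by the running gradient magnitude, each update moves the parameter by roughly learning_rate, which is one reason the same initial_learning_rate can behave differently after switching optimizers.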

The settings for the other gradient descent optimizers are defined in the file below.
https://github.com/tensorflow/models/blob/master/research/object_detection/protos/optimizer.proto

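Adding data augmentation

What is data augmentation?

It is a technique for increasing the amount of training data by applying artificial transformations such as shifting, rotating, and scaling.
In supervised learning, where the amount of data is often the bottleneck, it is a very effective approach.

I found the article below easy to follow, so check it out for details.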
http://aidiary.hatenablog.com/entry/20161212/1481549365

Default state

In the default config, you should find the following block around lines 164-167.
We will change the part called data_augmentation_options.
By default, only random_horizontal_flip is set.

  data_augmentation_options {
    random_horizontal_flip {
    }
  }
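As a rough idea of what a horizontal flip has to do for object detection, here is a small Python sketch. It is not the API's implementation: it uses nested lists instead of tensors, and it assumes boxes are given as normalized [ymin, xmin, ymax, xmax], which is my understanding of the API's box format. The helper names are made up for illustration.

```python
import random


def flip_horizontally(image, boxes):
    """Mirror an image (list of pixel rows) and its boxes left to right."""
    flipped_image = [list(reversed(row)) for row in image]
    # The x coordinates are mirrored around 0.5; xmin and xmax swap roles.
    flipped_boxes = [[ymin, 1.0 - xmax, ymax, 1.0 - xmin]
                     for ymin, xmin, ymax, xmax in boxes]
    return flipped_image, flipped_boxes


def random_horizontal_flip(image, boxes, probability=0.5, rng=random):
    """Flip with the given probability, otherwise return the input as is."""
    if rng.random() < probability:
        return flip_horizontally(image, boxes)
    return image, boxes


image = [[1, 2, 3],
         [4, 5, 6]]
boxes = [[0.0, 0.0, 1.0, 0.25]]  # a box hugging the left edge
print(flip_horizontally(image, boxes))
```

The point is that the labels have to be transformed together with the pixels; flipping only the image would leave the boxes pointing at the wrong side.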

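Configuring it

This time I tried setting the following two options:
random_horizontal_flip
Randomly flips the image horizontally.
random_rgb_to_gray
Randomly converts the RGB values to grayscale... (I'm not sure about this one)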
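My understanding of random_rgb_to_gray is that, with the given probability, the whole image is converted to grayscale while keeping three channels. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not the API's code: the luminance weights are the common ITU-R 601 values, and the helper names are hypothetical.

```python
import random


def rgb_to_gray(image):
    """Replace every (r, g, b) pixel with its luminance on all 3 channels."""
    gray_image = []
    for row in image:
        gray_row = []
        for r, g, b in row:
            # Assumed weights; a library may use slightly different ones.
            luminance = 0.2989 * r + 0.5870 * g + 0.1140 * b
            gray_row.append((luminance, luminance, luminance))
        gray_image.append(gray_row)
    return gray_image


def random_rgb_to_gray(image, probability=0.5, rng=random):
    """Convert the whole image to gray with the given probability."""
    if rng.random() < probability:
        return rgb_to_gray(image)
    return image


image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]  # one red and one green pixel
print(random_rgb_to_gray(image, probability=1.0))
```

With probability: 0.5 as in the config, roughly half of the training images would lose their color information under this reading, which pushes the model to rely on shape more than color.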

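  # One preprocessing step per data_augmentation_options block
  # (the steps are a oneof in preprocessor.proto).
  data_augmentation_options {
    random_horizontal_flip {
      keypoint_flip_permutation: 1
    }
  }
  data_augmentation_options {
    random_rgb_to_gray {
      probability: 0.5
    }
  }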

It looks like there are many other options to choose from in the files below.
https://github.com/tensorflow/models/blob/master/research/object_detection/protos/preprocessor.proto
https://github.com/tensorflow/models/blob/master/research/object_detection/core/preprocessor.py

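With these settings, in my case, detection went from 20% to around 50%.

What I couldn't figure out

random_horizontal_flip and random_vertical_flip both have a keypoint_flip_permutation field.
It seems to accept values from 0 to 5, but I couldn't work out what the different numbers mean.

The proto file defines it as shown below.
Still, I couldn't make sense of the explanation...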

message RandomHorizontalFlip {
  // Specifies a mapping from the original keypoint indices to horizontally
  // flipped indices. This is used in the event that keypoints are specified,
  // in which case when the image is horizontally flipped the keypoints will
  // need to be permuted. E.g. for keypoints representing left_eye, right_eye,
  // nose_tip, mouth, left_ear, right_ear (in that order), one might specify
  // the keypoint_flip_permutation below:
  // keypoint_flip_permutation: 1
  // keypoint_flip_permutation: 0
  // keypoint_flip_permutation: 2
  // keypoint_flip_permutation: 3
  // keypoint_flip_permutation: 5
  // keypoint_flip_permutation: 4
  repeated int32 keypoint_flip_permutation = 1;
}
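Reading the comment again, my current interpretation is that keypoint_flip_permutation is not a single choice between 0 and 5. It is a repeated field, so the six lines together form one list, [1, 0, 2, 3, 5, 4], which says where each keypoint slot takes its value from after the flip. The Python sketch below illustrates that reading. The helper name and the sample coordinates are made up, and keypoints are assumed to be normalized (y, x) pairs.

```python
KEYPOINT_NAMES = ["left_eye", "right_eye", "nose_tip",
                  "mouth", "left_ear", "right_ear"]
FLIP_PERMUTATION = [1, 0, 2, 3, 5, 4]  # the list from the proto comment


def flip_keypoints(keypoints, flip_permutation):
    """Mirror (y, x) keypoints horizontally and reorder left/right labels."""
    # Step 1: mirror every x coordinate around the image center.
    mirrored = [(y, 1.0 - x) for y, x in keypoints]
    # Step 2: slot i now takes the point that used to be in slot
    # flip_permutation[i], so "left_eye" still means the left eye.
    return [mirrored[source] for source in flip_permutation]


# A made-up, perfectly symmetric face.
keypoints = [(0.4, 0.25), (0.4, 0.75), (0.5, 0.5),
             (0.7, 0.5), (0.45, 0.125), (0.45, 0.875)]
for name, point in zip(KEYPOINT_NAMES, flip_keypoints(keypoints, FLIP_PERMUTATION)):
    print(name, point)
```

Indices 0 and 1 (the eyes) and 4 and 5 (the ears) swap, while 2 and 3 (nose tip and mouth) stay in place because they have no left/right counterpart. If the dataset has no keypoints, this field should not matter.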
