[FaceNet] Feature Extraction: A Unified Embedding for Face Recognition and Clustering

Neural networks

Posted at 2019-03-19 (last updated at 2019-03-19)

FaceNet: A Unified Embedding for Face Recognition and Clustering

Abstract

Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.

Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128-bytes per face.

On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result [15] by 30% on both datasets.

We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible to each other and allow for direct comparison between each other.

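Face recognition has advanced considerably in recent years [10, 14, 15, 17], but current approaches still struggle to perform face verification and recognition efficiently at scale. The paper presents a system called FaceNet, which directly learns a mapping from face images to a compact Euclidean space in which distance corresponds directly to face similarity. Once that space exists, face recognition, verification, and clustering can be implemented with standard techniques that use FaceNet embeddings as feature vectors.

The method trains a deep convolutional network to optimize the embedding itself, rather than an intermediate bottleneck layer as earlier deep learning approaches did. Training uses triplets of roughly aligned matching and non-matching face patches, generated by a new online triplet mining method. The main advantage is much greater representational efficiency: state-of-the-art recognition performance with only 128 bytes per face.

On the widely used Labeled Faces in the Wild (LFW) dataset the system sets a new record accuracy of 99.63%, and on YouTube Faces DB it reaches 95.12%. On both datasets this cuts the error rate by 30% compared with the best previously published result [15].

The paper also introduces harmonic embeddings and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible with each other and can be compared directly.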
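The abstract says the network is trained on triplets of matching / non-matching face patches, but it does not spell out the loss. As a rough illustration, the sketch below computes the standard hinge-style triplet loss for a single triplet: the anchor should be closer to the positive (same identity) than to the negative (different identity) by at least a margin, measured in squared Euclidean distance. The function names, the toy 2-D vectors, and the margin value of 0.2 are assumptions chosen for this example, and the online triplet mining step is not shown.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed value, for illustration only
) -> float:
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (in squared distance).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings, purely illustrative.
anchor = [1.0, 0.0]
positive = [0.8, 0.6]   # same identity as the anchor
negative = [0.0, 1.0]   # different identity

# Well-separated triplet: the margin is already satisfied.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapping the roles of positive and negative violates the margin,
# so the loss becomes positive.
loss_hard = triplet_loss(anchor, negative, positive)
```

A triplet with zero loss contributes no gradient, which is presumably why the choice of triplets during training matters.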

Summary

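Learns a mapping from face images to a Euclidean space in which distance directly measures face similarity
Optimizes the embedding directly, using triplets mined online during training
Face recognition: whose face is this?
Face verification: do two images show the same person?
Face clustering: which images show similar faces?
Recognition becomes a k-NN classification problem, and clustering can be done with k-means or similar methods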
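The points above can be sketched in a few lines once embeddings are available. The example below treats verification as a distance threshold and recognition as a nearest-neighbour lookup (k-NN with k = 1). The embeddings, the names in `gallery`, and the threshold of 0.5 are made-up values for illustration; a real system would take its embeddings from a trained network and tune the threshold on validation data.

```python
import math
from typing import Dict, Sequence


def distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def same_person(u: Sequence[float], v: Sequence[float], threshold: float) -> bool:
    """Verification: accept the pair if the embeddings are closer than `threshold`."""
    return distance(u, v) < threshold


def recognize(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> str:
    """Recognition as 1-nearest-neighbour: return the name of the closest gallery entry."""
    return min(gallery, key=lambda name: distance(query, gallery[name]))


# Hypothetical gallery with one toy 2-D embedding per known identity.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
query = [0.9, 0.1]

is_alice = same_person(query, gallery["alice"], threshold=0.5)
is_bob = same_person(query, gallery["bob"], threshold=0.5)
best_match = recognize(query, gallery)
```

With several embeddings per identity, `recognize` would generalize to a majority vote over the k closest entries. Clustering would run k-means or a similar method over the same vectors.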
