@harugonos posted at 2023-09-14

Class prediction with ArcFace for data whose class is unknown

Q&A

Machine Learning, Classification, Deep Learning, ArcFace

What I want to solve

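I want to use ArcFace, a deep metric learning method, to predict the class of unseen data whose class is unknown.

However, ArcFace is trained so that the cosine similarity between a sample and the representative vector of its correct class becomes large, so the model takes two inputs: [image, correct class].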
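
For reference, the per-sample loss minimized in this training can be written as below. This is the standard ArcFace formulation; the symbols are labels chosen here for illustration: $\theta_j$ is the angle between the normalized feature vector and the normalized weight vector of class $j$, $y$ is the correct class, $s$ is the scale, and $m$ is the margin.

```latex
L = -\log \frac{e^{s \cos(\theta_{y} + m)}}
               {e^{s \cos(\theta_{y} + m)} + \sum_{j \neq y} e^{s \cos \theta_{j}}}
```

The margin $m$ is applied only to the correct class $y$, which is why the label has to be supplied as an input during training.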

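Because of this, I have not been able to predict classes for data whose class is unknown.

In short, I would like to know how to implement a model that takes only an image as input for such data and predicts its class.

My current ArcFace model is shown below.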

ArcFacelayer

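import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class Arcfacelayer(Layer):
    # s: scale (softmax temperature) parameter, m: angular margin
    def __init__(self, output_dim, s, m, easy_margin=False, **kwargs):
        # Call the base constructor first; **kwargs lets from_config() pass name/dtype
        super(Arcfacelayer, self).__init__(**kwargs)
        self.output_dim = output_dim
        self.s = s
        self.m = m
        self.easy_margin = easy_margin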

    def get_config(self):
        config = {
            "output_dim" : self.output_dim,
            "s" : self.s,
            "m" : self.m,
            "easy_margin" : self.easy_margin
        }
        base_config = super().get_config()
        return dict(list(base_config.items()) + list(config.items()))
    

    # Create the layer weights
    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                        shape=(input_shape[0][1], self.output_dim),
                                        initializer='uniform',
                                        trainable=True)
        super(Arcfacelayer, self).build(input_shape)

    def call(self, x):
        # x is [features, one-hot labels]
        y = x[1]
        # Normalize each feature vector (row) and each class weight vector (column)
        x_normalize = tf.math.l2_normalize(x[0], axis=1) # x = x' / ||x'||2
        k_normalize = tf.math.l2_normalize(self.kernel, axis=0) # Wj = Wj' / ||Wj'||2

        cos_m = K.cos(self.m)
        sin_m = K.sin(self.m)
        th = K.cos(np.pi - self.m)
        mm = K.sin(np.pi - self.m) * self.m

        cosine = K.dot(x_normalize, k_normalize) # cos(theta): dot product W^T x
        # Clip so that rounding errors cannot make the sqrt argument negative
        sine = K.sqrt(K.clip(1.0 - K.square(cosine), 0.0, 1.0))

        phi = cosine * cos_m - sine * sin_m # cos(theta + m) via the angle addition formula

        if self.easy_margin:
            phi = tf.where(cosine > 0, phi, cosine)
        else:
            phi = tf.where(cosine > th, phi, cosine - mm)

        # Correct class: cos(theta + m); other classes: cos(theta)
        output = (y * phi) + ((1.0 - y) * cosine)
        output *= self.s

        return output

    def compute_output_shape(self, input_shape):
        return (input_shape[0][0], self.output_dim) # the input is [x, y], so the batch size of x is input_shape[0][0]

The base network is ResNet50V2.

ResNet50v2 + ArcFace definition

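from tensorflow.keras import regularizers
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                                     Dropout, Flatten, Input, InputLayer)
from tensorflow.keras.models import Model, Sequential

def create_arcface_with_resnet50v2(input_shape, s, m):
    
    weight_decay = 1e-4
    
    # Add a custom input block in front of the ResNet50V2 input layer
    input_tensor = input_shape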
    
    input_model = Sequential()
    input_model.add(InputLayer(input_shape=input_tensor))
    input_model.add(Conv2D(3, (7, 7), padding='same'))
    input_model.add(BatchNormalization())
    input_model.add(Activation('relu'))
    
    resnet50v2 = ResNet50V2(include_top=False, weights=None, input_tensor=input_model.output)
    
    # Load the weights downloaded in advance
    resnet50v2.load_weights('save_model(weights_imagenet)/weights_imagenet.hdf5', by_name=True)

    drop = Dropout(0.5)(resnet50v2.layers[-1].output)

    flat = Flatten()(drop)
    dense = Dense(512, kernel_initializer="he_normal", kernel_regularizer=regularizers.l2(weight_decay), name="hidden")(flat)
    
    x = BatchNormalization()(dense)
    
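    # The class label is also used as an input (num_classes must be defined in the enclosing scope)
    yinput = Input(shape=(num_classes,))
    
    s_cos = Arcfacelayer(num_classes, s, m)([x,yinput]) # output dimension equals the number of classes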
    prediction = Activation('softmax')(s_cos)

    model = Model(inputs=[resnet50v2.input,yinput], outputs=prediction)
    
    return model

What I tried

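For class prediction, I think it would be enough to pick the class whose representative vector has the highest similarity, but I do not know how to obtain those representative vectors either.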
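
One possible way to obtain the representative vectors, sketched under assumptions: in ArcFace, the columns of the layer's weight matrix (`self.kernel` in `Arcfacelayer` above) act as the class representative vectors once L2-normalized, so the predicted class can be taken as the column with the highest cosine similarity to the normalized feature vector. The sketch uses NumPy with random stand-in arrays. `predict_by_class_centers` and `l2_normalize` are hypothetical helpers, `kernel` stands for the array one would read from the trained layer (for example with the Keras layer method `get_weights()`), and `features` stands for the vectors that are fed into `Arcfacelayer`.

```python
import numpy as np

def l2_normalize(a, axis):
    """Scale vectors along `axis` to unit L2 norm."""
    norm = np.linalg.norm(a, axis=axis, keepdims=True)
    return a / np.maximum(norm, 1e-12)

def predict_by_class_centers(features, kernel):
    """Return (predicted class index, cosine similarity) for each feature row.

    features: (n_samples, dim) vectors that would be fed into the ArcFace layer.
    kernel:   (dim, n_classes) ArcFace weight matrix; column j belongs to class j.
    """
    cosine = l2_normalize(features, axis=1) @ l2_normalize(kernel, axis=0)
    return cosine.argmax(axis=1), cosine.max(axis=1)

# Stand-in data (assumption): 3 classes, 8-dimensional features.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(8, 3))
# Features built as noisy, rescaled copies of the class 2, 0 and 1 vectors.
features = kernel.T[[2, 0, 1]] * 5.0 + rng.normal(scale=0.01, size=(3, 8))

pred, score = predict_by_class_centers(features, kernel)
print(pred, score)
```

Because the layer computes `(y * phi) + ((1.0 - y) * cosine)`, an all-zero label input reduces its output to `s * cosine`, so feeding a zero vector as the class input of the existing model should give the same ranking of classes.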

At present, I compute the similarity between samples whose correct classes are known, using the vectors obtained from the prediction model below.

predict_model = Model(arcface_model.get_layer(index=0).input, arcface_model.get_layer(index=-5).output) # build predict_model

I could not find a method for class prediction with ArcFace online, so I would appreciate your help.


@deeplightning posted at 2023-10-05

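I suspect you have the same misunderstanding I once had, so I will write down what I have understood.
Apologies if this is not the answer you wanted, or if I am wrong.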

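ArcFace is not used to predict the class directly.
Even if you remove everything from Arcfacelayer onward and replace it with a softmax layer, inference will probably not work correctly.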

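> ...so the model takes two inputs: [image, correct class].
The correct class is only a label used to teach the model the image features and how to separate them,
and it is no longer needed at inference time.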

As you wrote yourself,
> predict_model = Model(arcface_model.get_layer(index=0).input, arcface_model.get_layer(index=-5).output)
rebuilds the network into an inference model whose input is the image alone and whose output is the feature vector.

As for the representative vectors, you can obtain them with the inference model above as
representative vector = predict_model(representative image)
and then store them.
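
The step above can be sketched roughly as follows. This is only an illustration under assumptions: `build_gallery` is a hypothetical helper, `fake_embed` is a stand-in for `predict_model` so the snippet runs without a trained network, and averaging the normalized vectors when a class has several representative images is a choice made here, not something stated above.

```python
import numpy as np

def build_gallery(embed_fn, representative_images):
    """Map each class name to a unit-length representative vector.

    embed_fn: callable turning a batch of images into feature vectors
              (stands in for predict_model).
    representative_images: dict of class name -> array of shape (n, H, W, C).
    """
    gallery = {}
    for name, images in representative_images.items():
        vectors = np.asarray(embed_fn(images), dtype=np.float64)
        vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
        mean = vectors.mean(axis=0)  # average when several images are given
        gallery[name] = mean / np.linalg.norm(mean)
    return gallery

# Stand-in embedder (assumption): per-channel mean instead of a real network.
def fake_embed(images):
    return images.mean(axis=(1, 2))

rng = np.random.default_rng(0)
reps = {
    "class_a": rng.uniform(0.1, 1.0, size=(1, 4, 4, 3)),  # one representative image
    "class_b": rng.uniform(0.1, 1.0, size=(2, 4, 4, 3)),  # two representative images
}
gallery = build_gallery(fake_embed, reps)
print({name: vec.shape for name, vec in gallery.items()})
```

The resulting dictionary can be kept in memory or written to disk, for example with `np.savez`, so that the representative images do not have to be embedded again.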

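For example, let image A be an image that resembles the representative image.
And let image B be an image of an unknown class that does not resemble the representative.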

Then compute
vector of image A = predict_model(image A)
vector of image B = predict_model(image B)

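After that, using a measure such as Euclidean distance or cosine similarity,
Cosine_Similarity(representative vector, vector of image A) is 0 or greater, so A is similar
Cosine_Similarity(representative vector, vector of image B) is 0 or less, so B is not similar
and you judge and classify the images on that basis.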
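
A minimal sketch of this decision rule, under assumptions: `cosine_similarity` and `classify` are hypothetical helpers, the vectors are hand-made stand-ins rather than real embeddings, and the default threshold of 0.0 only mirrors the rule described above and would need adjusting for real data.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(vector, gallery, threshold=0.0):
    """Return (best class or None, best similarity).

    gallery:   dict of class name -> representative vector.
    threshold: similarity below which the input is treated as unknown
               (0.0 is an assumption taken from the rule in the text).
    """
    scores = {name: cosine_similarity(vector, rep) for name, rep in gallery.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]

# Hand-made stand-in vectors (assumption), not real embeddings.
gallery = {"class_a": [1.0, 0.0, 0.0], "class_b": [0.0, 1.0, 0.0]}
image_a = [0.9, 0.1, 0.0]    # points roughly along class_a
image_b = [-1.0, -1.0, 0.2]  # points away from every representative

print(classify(image_a, gallery))
print(classify(image_b, gallery))
```

With several stored representatives, the same function returns the closest class, and returns `None` when even the closest one falls below the threshold.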

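Input images must be brought to the same scale.
I was doing face recognition, and to match the scale between faces
I had to extract the face bounding box and resize it while preserving the aspect ratio.
Searching for "MTCNN" will turn up relevant information.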

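The aspect-ratio-preserving resize mentioned above could look roughly like the sketch below. It assumes the face box has already been cropped by a detector, which is not shown. `resize_keep_aspect` is a hypothetical helper, the 160-pixel target is an arbitrary example value, and nearest-neighbour sampling with zero padding is used only so that NumPy alone is enough. An image library would normally do the interpolation.

```python
import numpy as np

def resize_keep_aspect(image, target):
    """Resize an (H, W, C) image to fit inside target x target, then zero-pad to a square."""
    h, w = image.shape[:2]
    scale = target / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour index maps from output pixels back to input pixels.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]
    # Centre the resized image on a square canvas.
    canvas = np.zeros((target, target) + image.shape[2:], dtype=image.dtype)
    top = (target - new_h) // 2
    left = (target - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

# Stand-in face crop (assumption): 60 x 40 pixels, 3 channels, all white.
crop = np.full((60, 40, 3), 255, dtype=np.uint8)
out = resize_keep_aspect(crop, 160)
print(out.shape)
```

The longer side is scaled to the target size and the shorter side is padded, so the face is not stretched.
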
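If you are doing face recognition, training a model yourself is not worthwhile.
It requires a large amount of face data and annotation,
which costs too much for small companies or individuals.
Searching for "facenet_keras_weights" will turn up useful information.
* Even outside face recognition, it is better to use pretrained weights as they are.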

That is all.
