Two-Direction Mapping with StyleGAN

Posted at 2019-03-01 (last updated at 2019-03-01)

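I am a Deep Learning beginner, so I would appreciate it if you pointed out any mistakes.

What is StyleGAN, anyway (a bit late to ask)

Everything is explained in StyleGAN "The era when photos counted as evidence is over." (yes, I am passing the buck).

If you are curious enough to read the original paper, a PDF is available at A Style-Based Generator Architecture for Generative Adversarial Networks.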

The one you see everywhere

Blend $A$ and $B$ and you get a nice in-between image of $AB$!

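But I want to blend three

Googling does not turn up much, and what does come up is nothing but faces of cheerfully smiling people.
I want to do this with character faces.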

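So let's just implement it

I wanted to try this out quickly, so this time I followed this Colaboratory Book as a model and used its pre-trained model.

How do you map in two directions?

First, let's read the code that pulls out the two images.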

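# generate images code interpolated across one of the 512 dimensions
# (np, seed, interpolate_dim and Gs are assumed to be defined earlier in the notebook)
init_latent = np.random.RandomState(seed).randn(1, Gs.input_shape[1])[0]

def apply_latent_fudge(fudge):
  # Return a copy of init_latent with a single dimension shifted by fudge
  copy = np.copy(init_latent)
  copy[interpolate_dim] += fudge
  return copy

# 10 offsets evenly spaced from -15 to 15
interpolate = np.linspace(0., 30., 10) - 15
# One latent vector per offset
latents = np.array(list(map(apply_latent_fudge, interpolate)))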

It looks like applying interpolate to the noise init_latent is what produces the variation between two points.
I picture it roughly like this:

N = \begin{bmatrix}
R_0\dots R_d
\end{bmatrix} \\
I = \begin{bmatrix}
J \dots K
\end{bmatrix} \\
L = \begin{bmatrix}
(N + I_0) \dots (N + I_n)
\end{bmatrix}

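$N$ := noise (the initial latent vector), $I$ := interpolation offsets, $J, K \in \mathbb{R}$ (in the code above, $J = -15$ and $K = 15$), $R_d$ := random value. Note that $N + I_k$ is shorthand: the code adds the offset $I_k$ only to the component at index interpolate_dim, not to every component.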
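
To make the picture above concrete, here is a minimal NumPy sketch of the same single-dimension shift without the map/lambda plumbing. The values of latent_dim, interpolate_dim and seed are stand-ins I picked for illustration; in the notebook they come from the model and earlier cells.

```python
import numpy as np

# Hypothetical stand-ins for values that the notebook defines elsewhere
latent_dim = 512        # assumed to match Gs.input_shape[1]
interpolate_dim = 7     # arbitrary dimension chosen for illustration
seed = 0

init_latent = np.random.RandomState(seed).randn(latent_dim)

# Offsets I_0 ... I_n: 10 values evenly spaced from -15 to 15
interpolate = np.linspace(0., 30., 10) - 15

# Repeat the noise vector once per offset, then shift only one component
latents = np.tile(init_latent, (len(interpolate), 1))
latents[:, interpolate_dim] += interpolate

# Check which components differ from the original noise vector
diff = latents - init_latent
changed_dims = np.nonzero(np.any(diff != 0, axis=0))[0]
print(latents.shape)   # (10, 512)
print(changed_dims)    # [7]
```

So each row is the same noise vector moved along one axis of the latent space; every other component stays untouched.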

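I do not fully understand why adding the interpolation offsets to the noise moves the image toward a different one,
but extending this to two directions would probably look like this: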

N_1 = \begin{bmatrix}
R_0\dots R_d
\end{bmatrix} \\
N_2 = \begin{bmatrix}
R'_0\dots R'_d
\end{bmatrix} \\
I = \begin{bmatrix}
J \dots K
\end{bmatrix} \\
L_1 = \begin{bmatrix}
(N_1 + I_0) \dots (N_1 + I_n)
\end{bmatrix} \\
L_2 = \begin{bmatrix}
(N_2 + I_0) \dots (N_2 + I_n)
\end{bmatrix} \\
L = \begin{bmatrix}
L_1 \dots L_2
\end{bmatrix}

The thinking is simply: prepare two noise vectors with the interpolation offsets applied, then linearly interpolate between them.

It sort of worked

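# np, Gs, synthesis_kwargs, interpolate_dim, init_latent1, init_latent2,
# width and height are assumed to be defined earlier in the notebook (not shown here)

def apply_latent_fudge(fudge, target_latent):
  # Return a copy of target_latent with a single dimension shifted by fudge
  copy = np.copy(target_latent)
  copy[interpolate_dim] += fudge
  return copy


def gen_interpolate():
  # width offsets evenly spaced from 0 to 30
  return np.linspace(0, 30, width)

interpolate = gen_interpolate()
# latent1 and latent2 each hold one shifted latent vector per offset
applier1 = lambda f: apply_latent_fudge(f, init_latent1)
latent1 = np.array(list(map(applier1, interpolate)))
applier2 = lambda f: apply_latent_fudge(f, init_latent2)
latent2 = np.array(list(map(applier2, interpolate)))

def v_linspace(i):
  # Interpolate from latent1[i] to latent2[i] in height steps, one component at a time
  v = np.array(list(map(lambda l: np.linspace(latent1[i][l], latent2[i][l], height), range(latent1.shape[1]))))
  v = v.transpose()
  return v

# Shape (width, height, latent_dim)
combined_latent = np.array(list(map(v_linspace, range(width))))
# Reorder to (height * width, latent_dim); row h * width + w is grid cell (h, w)
combined_latent = np.array(list(map(lambda x: x.flatten(), combined_latent.transpose()))).transpose()

images = Gs.run(combined_latent, None, **synthesis_kwargs)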
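
For reference, the grid of latents built above can probably be written more compactly with NumPy broadcasting. The following is only a sketch under assumptions: latent_dim, interpolate_dim, width, height and the two random latents are stand-ins I made up, and the Gs.run call is left out so the snippet runs without the model.

```python
import numpy as np

# Hypothetical stand-ins for values that the notebook defines elsewhere
latent_dim, interpolate_dim = 512, 7
width, height = 5, 4
rng = np.random.RandomState(0)
init_latent1 = rng.randn(latent_dim)
init_latent2 = rng.randn(latent_dim)

offsets = np.linspace(0, 30, width)

def shifted(latent):
    # (width, latent_dim): one copy per offset, shifted along interpolate_dim
    out = np.tile(latent, (width, 1))
    out[:, interpolate_dim] += offsets
    return out

latent1, latent2 = shifted(init_latent1), shifted(init_latent2)

# Vertical blend weights, shaped to broadcast over (height, width, latent_dim)
t = np.linspace(0., 1., height)[:, None, None]
grid = (1 - t) * latent1[None] + t * latent2[None]

# Flatten row by row so that row h * width + w is grid cell (h, w)
combined_latent = grid.reshape(height * width, latent_dim)
print(combined_latent.shape)  # (20, 512)
```

The first width rows are latent1 and the last width rows are latent2, with the rows in between blending from one to the other.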

StyleGAN 2way mapping (Google Colab)
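It works, but I do not fully understand it, which leaves me uneasy.
Also, I have a feeling this is not really "2-way" but rather interpolation between four vertices.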
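
The four-vertex hunch can be checked numerically. Here is a small sketch comparing the grid built the way the code above builds it against textbook bilinear interpolation of four corner latents. All concrete values (latent_dim, interpolate_dim, width, height, the random latents n1 and n2) are stand-ins I chose for illustration, not values from the notebook.

```python
import numpy as np

# Hypothetical stand-ins for values that the notebook defines elsewhere
latent_dim, interpolate_dim = 512, 7
width, height = 5, 4
rng = np.random.RandomState(0)
n1 = rng.randn(latent_dim)
n2 = rng.randn(latent_dim)

# Unit vector along the shifted dimension
e = np.zeros(latent_dim)
e[interpolate_dim] = 1.0

# Four corner latents of the grid (offsets run from 0 to 30)
top_left, top_right = n1, n1 + 30 * e
bottom_left, bottom_right = n2, n2 + 30 * e

s = np.linspace(0., 1., width)[None, :, None]   # horizontal weight
t = np.linspace(0., 1., height)[:, None, None]  # vertical weight

# Standard bilinear interpolation of the four corners
bilinear = ((1 - t) * (1 - s) * top_left + (1 - t) * s * top_right
            + t * (1 - s) * bottom_left + t * s * bottom_right)

# The grid as built in the article: shift both latents, then blend vertically
offsets = np.linspace(0, 30, width)[None, :, None]
grid = (1 - t) * (n1 + offsets * e) + t * (n2 + offsets * e)

print(np.allclose(bilinear, grid))  # True
```

Under these assumptions the two grids agree, so the result does behave like bilinear interpolation between four corner latents, where two of the corners are just the other two shifted along one latent dimension.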

If you do not mind, I would appreciate any corrections or pointers.
