More than 5 years have passed since last update.

【StyleGAN２入門】10個のアニメ顔で独自学習♬

Posted at 2020-01-30

StyleGANに引き続いて、StyleGAN2でもアニメ顔の独自学習をやってみた。
結論から言うと、config-aは学習できたが、config-bからconfig-fは環境が整っていない。
とはいえ、ほぼコードは見えてきたので、一度まとめておこうと思う。
特に今回は、10個のみのアニメ顔でどこまで学習するかをやってみた。
【参考】
①NVlabs/stylegan2
②stylegan2で独自モデルの学習方法

やったこと

・環境
・10個のアニメ顔で学習
・出力してみる
・出力画像とオリジナル画像

・環境

StyleGAN2の環境は以下のとおり

Requirements
・Both Linux and Windows are supported. 
・Linux is recommended for performance and compatibility reasons.
・64-bit Python 3.6 installation. We recommend Anaconda3 with numpy 1.14.3 or newer.
・TensorFlow 1.14 or 1.15 with GPU support. The code does not support TensorFlow 2.0.
・On Windows, you need to use TensorFlow 1.14 — TensorFlow 1.15 will not work.
・One or more high-end NVIDIA GPUs, NVIDIA drivers, CUDA 10.0 toolkit and cuDNN 7.5. ・To reproduce the results reported in the paper, 
　　　　　you need an NVIDIA GPU with at least 16 GB of DRAM.
・Docker users: use the provided Dockerfile to build an image with the required library dependencies.

・StyleGAN2 relies on custom TensorFlow ops that are compiled on the fly using NVCC. 
To test that your NVCC installation is working correctly, run:
nvcc test_nvcc.cu -o test_nvcc -run
| CPU says hello.
| GPU says hello.
・On Windows, the compilation requires Microsoft Visual Studio to be in PATH. 
We recommend installing Visual Studio Community Edition and adding into PATH 
using "C:\Program Files (x86)\Microsoft
VisualStudio\2019\Community\VC\Auxiliary\Build\vcvars64.bat".

ところが、一応整ったと思ったところでconfig-a以外は、以下のエラーを吐いて止まってしまう。

 File "C:\Users\user\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 185, in __init__
    self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor'

・10個のアニメ顔で学習

ということで、今回はconfig-aすなわちStyleGANで学習を試みた。
まず、準備として
⓪アニメ顔を１０枚選んで、前回と同様にcustom_dataset-r02.tfrecordsを作成する

コードは以下のとおりで、1060で動くように変更する。
コード全体は以下においた。
StyleGAN2/run_training.py /
変更な主な部分は、以下のとおり
①ターゲット解像度＝６４とした
以下のコードで読み込むとサイズ（128,128）の画像も（64，64）で読み込め、StyleGANのネットワーク構造のものとなる。

run_training.py

    dataset_args = EasyDict(tfrecord_dir=dataset, resolution=64)  #, resolution=64

②初期解像度も64とした
これはStyleGANはpGANなので学習が進むとだんだん次元が大きくなるが、今回はどうもそれがうまく動かないので、最初からターゲットな解像度の64にしてやってみたら収束しだしたのでこれを採用した

run_training.py

sched.lod_initial_resolution = 64 #8

run_training.pyのそのほかの変更は上記のコード全体を見てほしい
③これだけだとメモリーエラーを吐いて止まってしまうのでdataset.pyの以下の部分を変更します。
StyleGAN2/training/dataset.py

training/dataset.py

# Load labels.
        assert max_label_size == 'full' or max_label_size >= 0
        #self._np_labels = np.zeros([1<<30, 0], dtype=np.float32)
        self._np_labels = np.zeros([1<<20, 0], dtype=np.float32)

④最後に学習が時間かかるのとHDDのメモリーが不足気味なので毎回出力に変更
StyleGAN2/training/training_loop.py

training/training_loop.py

    image_snapshot_ticks    = 1,  #50     # How often to save image snapshots? None = only save 'reals.png' and 'fakes-init.png'.
    network_snapshot_ticks  = 1, #50      # How often to save network snapshots? None = only save 'networks-final.pkl'.
．．．
            if network_snapshot_ticks is not None and (cur_tick % network_snapshot_ticks == 0 or done):
                #pkl = dnnlib.make_run_dir_path('network-snapshot-%06d.pkl' % (cur_nimg // 1000))
                pkl = dnnlib.make_run_dir_path('network-snapshot-.pkl')
                misc.save_pkl((G, D, Gs), pkl)

・出力してみる

出力例のミキシングやってみる

# Example of style mixing (matches the corresponding video clip)
python run_generator.py style-mixing-example --network=./results/00007-stylegan2-custom_dataset-1gpu-config-a/network-final.pkl --row-seeds=85,100,75,458,1500 --col-seeds=55,821,1789,293 --truncation-psi=1.0

以下のとおり、学習できている。

また、前回のStyleGANのミキシングも以下のコードで実施できた。

そして、適当に100枚出力してみると以下のように出力した。

StyleGAN2/pretrained_example.py