
Forcing Deep Photo Style Transfer to Run in a Low-Budget Environment

Posted at 2017-03-31 (last updated at 2017-03-31)

In late March 2017, a new paper on style transfer appeared out of nowhere and drew attention for the quality of its results.

I wanted to try it myself. But when I did, a big problem stood in the way.

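Problem: the environment needed to reproduce it is expensive

In short, the official reproduction environment is expensive to obtain (I do not have Matlab, and my GPU is a GeForce 1050Ti 4GB).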

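It does not work with Octave

The original code assumes Matlab (apparently it has not been tested with Octave?), so if you follow the official procedure, the run dies right where the Lua script starts executing. Matlab is expensive.

Example of the Lua script crashing when the preprocessing is done with Octave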

gpu, idx = 	0	1	
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
loading matting laplacian...	gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat	
File could not be opened: gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat	
/home/kuni/work/misc/torch/install/bin/luajit: deepmatting_seg.lua.org:119: attempt to index a nil value
stack traceback:
	deepmatting_seg.lua.org:119: in function 'main'
	deepmatting_seg.lua.org:606: in main chunk
	[C]: in function 'dofile'
	...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

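The model does not fit in GPU memory

In addition, this model does not fit on a 4GB GPU, so it has to be run on an expensive GPU. GPUs are expensive. Torch also has no easy way to switch between CPU and GPU (it offers no structuring mechanism for this, so it is left to the implementer). On top of that, the authors of the paper seem to have put off structuring their code for such switching.

Example of the out-of-memory error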

gpu, idx = 	0	1	
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
loading matting laplacian...	gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat	
Exp serial:	examples/final_results	
Setting up style layer  	2	:	relu1_1	
Setting up style layer  	7	:	relu2_1	
Setting up style layer  	12	:	relu3_1	
THCudaCheck FAIL file=/home/kuni/work/misc/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/kuni/work/misc/torch/install/bin/luajit: ...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 11 module of nn.Sequential:
...e/kuni/work/misc/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /home/kuni/work/misc/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
	[C]: in function 'v'
	...e/kuni/work/misc/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
	...sc/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...sc/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
	[C]: in function 'xpcall'
	...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../work/misc/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	deepmatting_seg.lua.org:162: in function 'main'
	deepmatting_seg.lua.org:606: in main chunk
	[C]: in function 'dofile'
	...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	.../work/misc/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	deepmatting_seg.lua.org:162: in function 'main'
	deepmatting_seg.lua.org:606: in main chunk
	[C]: in function 'dofile'
	...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

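So, to make this runnable in a low-budget environment, I made a brute-force modification that solves the following two points.

Repository

Note that I assume working Octave, Lua, and Torch environments are already set up (mine is Ubuntu 16.04 Xenial Xerus).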

Overview of the changes

Use Octave instead of Matlab (diff)
Load the network into main memory and compute on the CPU (with part of it left on the GPU) (diff)

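1. Use Octave instead of Matlab

Diff

The Laplacian computation on the input image has to be done as a preprocessing step in either Matlab or Octave, but doing it in Octave required two changes:

Add pkg load image so that im2double is available
The generated file ends up in Octave's default format (not Matlab-compatible), so force the format explicitly (-mat-binary)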
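To check which format a generated file actually has, a small header check can help. The following is a minimal Python sketch, not part of the original repository: the function names are hypothetical, and the byte signatures are assumptions based on the headers these formats commonly start with (a MATLAB Level 5 file begins with a text header starting "MATLAB 5.0 MAT-file", and Octave's default text format begins with a "# Created by Octave" comment).

```python
# Hypothetical helper: guess the format of a saved .mat file from its first
# bytes. The signatures below are assumptions based on the usual headers of
# each format; they are not taken from the article or its repository.
MATLAB_V5_MAGIC = b"MATLAB 5.0 MAT-file"    # MATLAB Level 5 binary header text
OCTAVE_TEXT_MAGIC = b"# Created by Octave"  # Octave's default text format
OCTAVE_BINARY_MAGIC = b"Octave-1-"          # Octave's native binary format


def guess_mat_format(header: bytes) -> str:
    """Classify a file from its leading bytes."""
    if header.startswith(MATLAB_V5_MAGIC):
        return "matlab-v5"
    if header.startswith(OCTAVE_TEXT_MAGIC):
        return "octave-text"
    if header.startswith(OCTAVE_BINARY_MAGIC):
        return "octave-binary"
    return "unknown"


def guess_mat_file(path: str) -> str:
    """Read only the first bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return guess_mat_format(f.read(32))
```

If guess_mat_file reports anything other than "matlab-v5" for a file under gen_laplacian/, the save call probably did not get the format option, which would match the "File could not be opened" message in the log above.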

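2. Load the network into main memory and compute on the CPU (with part of it left on the GPU)

Diff

I simply went through every place that uses CudaTensor and changed it to use FloatTensor. However, cuda_utils has deep dependencies, so I decided to leave it as a GPU implementation.
The key point of the modification is to replace the types without introducing inconsistencies (it is mechanical work).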
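Because the replacement is mechanical, it can be sketched as a text substitution. The Python snippet below is only an illustration under assumptions, not the actual diff: the substitution table, the function name to_cpu_types, and the rule of skipping any line that mentions cuda_utils are all hypothetical choices made for this example.

```python
import re

# Assumed substitution table for illustration only; the real diff may touch
# other identifiers as well.
REPLACEMENTS = [
    (re.compile(r"torch\.CudaTensor\b"), "torch.FloatTensor"),
    (re.compile(r":cuda\(\)"), ":float()"),
]


def to_cpu_types(lua_source: str, keep_marker: str = "cuda_utils") -> str:
    """Rewrite GPU tensor types to CPU ones, line by line.

    Lines containing `keep_marker` are left untouched, mirroring the
    decision to keep cuda_utils as a GPU implementation.
    """
    out = []
    for line in lua_source.splitlines(keepends=True):
        if keep_marker not in line:
            for pattern, replacement in REPLACEMENTS:
                line = pattern.sub(replacement, line)
        out.append(line)
    return "".join(out)
```

A text substitution like this cannot check that the types agree at the boundary between the CPU code and the part left on the GPU, so the result still has to be reviewed by hand.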

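Results

Input image (the content to be transformed)

Input image (style image)

Output

I appended my own photo to the end of the numbered examples as number 61 and ran it. The style transfer worked beautifully.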

iteration : 100

iteration : 500

iteration : 1000

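Summary

I was able to reproduce this beautiful style transfer on a low budget.
Whether you buy a GPU or rent a GPU cluster in the cloud, it is still expensive for ordinary people. When you just want to try something out, you would rather grind it out on the CPU and pay with time instead. I would be happy if Torch gained a mechanism (or convention) for casually switching between CPU and GPU, like Chainer and TensorFlow have.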
