
Implementing pix2pix on Windows (with friendly command-line output and examples of common errors)

Last updated at 2020-07-06. Posted at 2020-07-06.

These are notes on getting pix2pix running in a Windows environment.

First, copy the files

Open a Command Prompt and type the following command.

command

C:\Users\hoge>git clone https://github.com/phillipi/pix2pix

result

Cloning into 'pix2pix'...
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 479 (delta 5), reused 8 (delta 2), pack-reused 457
Receiving objects: 100% (479/479), 2.45 MiB | 1.02 MiB/s, done.
Resolving deltas: 100% (255/255), done.

The project is now copied into your home directory (in most cases C:\Users\hoge\).
This project is written in Lua, so we also copy a Python implementation that someone kindly wrote.

command

C:\Users\hoge>git clone https://github.com/tdeboissiere/DeepLearningImplementations.git

result

Cloning into 'DeepLearningImplementations'...
remote: Enumerating objects: 1616, done.
remote: Total 1616 (delta 0), reused 0 (delta 0), pack-reused 1616
Receiving objects: 100% (1616/1616), 50.34 MiB | 1.10 MiB/s, done.
Resolving deltas: 100% (754/754), done.

Now merge the two directories. Linux has the handy rsync command; at the Command Prompt, the robocopy command can apparently do the same thing. Handy!

command

robocopy /E DeepLearningImplementations\pix2pix\ pix2pix\

result

-------------------------------------------------------------------------------
   ROBOCOPY     ::     Robust File Copy for Windows
-------------------------------------------------------------------------------

  Started : July 6, 2020 18:47:00
   Source : C:\Users\hoge\DeepLearningImplementations\pix2pix\
     Dest : C:\Users\hoge\pix2pix\

    Files : *.*

  Options : *.* /S /E /DCOPY:DA /COPY:DAT /R:1000000 /W:30

------------------------------------------------------------------------------

(omitted for length)

------------------------------------------------------------------------------

               Total    Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :        10         7         3         0         0         5
   Files :        16        16         0         0         0         9
   Bytes :   560.2 k   560.2 k         0         0         0    61.4 k
   Times :   0:00:00   0:00:00                       0:00:00   0:00:00


   Speed :            30196526 Bytes/sec.
   Speed :            1727.859 MegaBytes/min.
   Ended : July 6, 2020 18:47:01
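
For reference, the same kind of merge can be sketched in Python. This is only an illustration, not what the walkthrough runs: it assumes Python 3.8 or later (for the dirs_exist_ok argument, so it would not work in the Python 3.6 environment created later), and merge_tree is a name made up for this example.

```python
import shutil
import tempfile
from pathlib import Path


def merge_tree(src, dst):
    # Copy everything under src into dst. Files that exist only in dst are kept;
    # files with the same relative path are overwritten by the copy from src.
    # dirs_exist_ok=True (Python 3.8+) allows dst to exist already.
    shutil.copytree(src, dst, dirs_exist_ok=True)


if __name__ == "__main__":
    # Throwaway directories standing in for the two cloned repositories.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "DeepLearningImplementations", "pix2pix")
        dst = Path(tmp, "pix2pix")
        (src / "src" / "model").mkdir(parents=True)
        (src / "src" / "model" / "main.py").write_text("# python port\n")
        dst.mkdir()
        (dst / "train.lua").write_text("-- original\n")

        merge_tree(src, dst)
        for path in sorted(dst.rglob("*")):
            if path.is_file():
                print(path.relative_to(dst).as_posix())
```

Like robocopy /E without /MIR, this leaves files that exist only in the destination untouched.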

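Downloading the dataset

The dataset has to be downloaded separately. There is a catch, though: the download is done by running a file called download_dataset.sh, and this kind of file basically cannot be run in a plain Windows environment. It is just a script containing a series of commands, but those commands are for Linux.

Various workarounds are conceivable, such as rewriting the commands for Windows or installing Cygwin, but as a minimal solution I use WSL (Windows Subsystem for Linux), which reproduces a Linux environment on Windows. If WSL is installed, typing wsl at the command line drops you into the Linux environment.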

command

C:\Users\hoge>cd pix2pix
C:\Users\hoge\pix2pix>cd datasets
C:\Users\hoge\pix2pix\datasets>wsl
hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$
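
Note how the prompt changed: the Windows directory C:\Users\hoge\pix2pix\datasets shows up inside WSL as /mnt/c/Users/hoge/pix2pix/datasets. A minimal sketch of that mapping in Python, assuming the default setup where drives are mounted under /mnt (windows_to_wsl is a name made up for this example):

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path):
    # Map an absolute Windows path with a drive letter to its WSL location,
    # assuming drives are mounted under /mnt (the default).
    win = PureWindowsPath(path)
    drive = win.drive
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError("expected an absolute path with a drive letter: %r" % (path,))
    # parts[0] is the drive plus root (e.g. "C:\\"); the rest are directory names.
    return str(PurePosixPath("/mnt", drive[0].lower(), *win.parts[1:]))


if __name__ == "__main__":
    print(windows_to_wsl(r"C:\Users\hoge\pix2pix\datasets"))
```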

Run the script here and the problem is solved! ...Or so you would think.

command

hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash download_dataset.sh facades

result

download_dataset.sh: line 2: $'\r': command not found
download_dataset.sh: line 17: syntax error: unexpected end of file

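An error occurs. It comes from Windows and Linux using different line endings (CRLF versus LF). When the repository was cloned with git on Windows, the checked-out files were given Windows line endings too! So running them in the Linux environment on Windows fails. Confusing!

There is no way around it, so we convert the file on the Windows side to Linux line endings.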
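
To make the problem concrete, here is a small Python sketch that counts CRLF line endings and removes the carriage returns, which is the same thing `tr -d '\r'` does. The function names are made up for this illustration.

```python
def count_crlf_lines(data):
    # Number of Windows-style (CR LF) line endings in the raw bytes.
    return data.count(b"\r\n")


def strip_carriage_returns(data):
    # Delete every CR byte, leaving plain LF line endings.
    return data.replace(b"\r", b"")


def convert_file(src_path, dst_path):
    # Read and write in binary mode so Python does not translate newlines itself.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_carriage_returns(data))


if __name__ == "__main__":
    sample = b"#!/bin/bash\r\nFILE=$1\r\n"
    print(count_crlf_lines(sample))
    print(strip_carriage_returns(sample))
```

A non-zero count for download_dataset.sh would explain the `$'\r': command not found` message: bash treats the stray carriage return as part of the command.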

command

hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ tr -d '\r' < download_dataset.sh > win2linux.sh
hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash win2linux.sh facades

But another error appears.

result

Specified [facades]
win2linux.sh: line 13: wget: command not found
tar (child): ./datasets/facades.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
rm: cannot remove './datasets/facades.tar.gz': No such file or directory

This is because the wget command is not installed. Let's install it. What a hassle.

command

hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ sudo apt install wget

result

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  liblua5.1-0-dev libreadline-dev libtinfo-dev libtool-bin lua-any lua-sec lua-socket unzip zip
Use 'sudo apt autoremove' to remove them.
The following NEW packages will be installed:
  wget
0 upgraded, 1 newly installed, 0 to remove and 81 not upgraded.
Need to get 316 kB of archives.
After this operation, 954 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 wget amd64 1.19.4-1ubuntu2.2 [316 kB]
Fetched 316 kB in 2s (190 kB/s)
Selecting previously unselected package wget.
(Reading database ... 79492 files and directories currently installed.)
Preparing to unpack .../wget_1.19.4-1ubuntu2.2_amd64.deb ...
Unpacking wget (1.19.4-1ubuntu2.2) ...
Setting up wget (1.19.4-1ubuntu2.2) ...
Processing triggers for install-info (6.5.0.dfsg.1-2) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...

Run it again.

command

hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash win2linux.sh facades

result

Specified [facades]
WARNING: timestamping does nothing in combination with -O. See the manual
for details.

--2020-07-06 19:33:36--  http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/facades.tar.gz
Resolving efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)... 128.32.189.73
Connecting to efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)|128.32.189.73|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30168306 (29M) [application/x-gzip]
Saving to: ‘./datasets/facades.tar.gz’

./datasets/facades.tar.gz     100%[=================================================>]  28.77M  1.13MB/s    in 26s
(rest omitted for length)

Once this finishes, we are done with the Linux environment. Let's leave it.

command

hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ exit

result

logout

C:\Users\hoge\pix2pix\datasets>
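
As an aside, the same download-and-extract step could be sketched in Python with only the standard library, which would avoid both the line-ending problem and the missing wget. This is an untested alternative, not what the walkthrough does: the base URL is the one that appears in the wget output above (whether it is still reachable is not something I have checked), and download_and_extract is a name made up for this example.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL copied from the wget output above; assumed to still be valid.
BASE_URL = "http://efrosgans.eecs.berkeley.edu/pix2pix/datasets"


def download_and_extract(name, dest_dir, base_url=BASE_URL):
    # Download <base_url>/<name>.tar.gz into dest_dir, unpack it there,
    # then delete the archive (the same steps the log output suggests).
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / (name + ".tar.gz")
    urllib.request.urlretrieve(base_url + "/" + name + ".tar.gz", str(archive))
    with tarfile.open(str(archive), "r:gz") as tar:
        tar.extractall(str(dest_dir))
    archive.unlink()
    return dest_dir
```

Calling download_and_extract("facades", "datasets") from C:\Users\hoge\pix2pix\datasets would mirror the directory layout used in the rest of this post.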

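Processing the dataset

Some dataset processing is needed. From here on we use Python. I assume Python is installed but otherwise vanilla (no extra libraries); here I reproduce that by creating a new Anaconda environment named pix.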

command

C:\Users\hoge\pix2pix\datasets>cd ..
C:\Users\hoge\pix2pix>conda create -n pix python=3.6
(intermediate steps omitted, including activating the pix environment)
(pix) C:\Users\hoge\pix2pix>

Several libraries are required, so let's install them.

command

(pix) C:\Users\hoge\pix2pix>conda install numpy
(pix) C:\Users\hoge\pix2pix>conda install keras
(pix) C:\Users\hoge\pix2pix>conda install -c conda-forge parmap
(pix) C:\Users\hoge\pix2pix>conda install matplotlib
(pix) C:\Users\hoge\pix2pix>conda install tqdm
(pix) C:\Users\hoge\pix2pix>conda install opencv
(pix) C:\Users\hoge\pix2pix>conda install h5py
(pix) C:\Users\hoge\pix2pix>conda install tensorflow-gpu
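
Before moving on, it can be handy to confirm that everything is importable from the new environment. A minimal sketch, assuming the usual import names for these packages (the opencv package is imported as cv2, and tensorflow-gpu as tensorflow); missing_modules is a name made up for this example.

```python
import importlib.util

# Import names for the packages installed above.
REQUIRED_MODULES = [
    "numpy", "keras", "parmap", "matplotlib",
    "tqdm", "cv2", "h5py", "tensorflow",
]


def missing_modules(names):
    # find_spec returns None when a top-level module cannot be located,
    # without actually importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be found; it says nothing about whether their versions are compatible with each other.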

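parmap and opencv need a little care. parmap is not available from conda's default channel, so it is installed from conda-forge with the command above. If you panic here and install it with pip, I hear the Anaconda environment can end up in a mess, though I have never experienced that myself. OpenCV is opencv-python on pip, but on conda the package name opencv is apparently fine.

With the libraries installed, move to src\data and run make_dataset.py to process the downloaded dataset.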

command

(pix) C:\Users\hoge\pix2pix\src\data>python make_dataset.py ../../datasets/datasets/facades/ 3 --img_size 256

result

100%|██████████████████████████████| 4/4 [00:01<00:00,  3.22it/s]
100%|██████████████████████████████| 1/1 [00:00<00:00,  3.12it/s]
100%|██████████████████████████████| 1/1 [00:00<00:00,  3.25it/s]

Now for training

command

(pix) C:\Users\hoge\pix2pix\src\data>cd ..
(pix) C:\Users\hoge\pix2pix\src>cd model
(pix) C:\Users\hoge\pix2pix\src\model>python main.py 64 64 --backend tensorflow --nb_epoch 10

result

Using TensorFlow backend.
2020-07-06 20:02:48.456473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.764917: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-06 20:02:51.835745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s
2020-07-06 20:02:51.842601: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.848476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-06 20:02:51.854786: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-06 20:02:51.859952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-06 20:02:51.867498: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-06 20:02:51.874415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-06 20:02:51.884355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-06 20:02:51.887931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-06 20:02:51.890744: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-07-06 20:02:51.895380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s
2020-07-06 20:02:51.901639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.904406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-06 20:02:51.908518: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-06 20:02:51.911282: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-06 20:02:51.915043: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-06 20:02:51.918430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-06 20:02:51.921086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-06 20:02:51.924511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-06 20:02:52.470406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-06 20:02:52.474907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-07-06 20:02:52.477401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-07-06 20:02:52.480235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6267 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
Traceback (most recent call last):
  File "main.py", line 73, in <module>
    launch_training(**d_params)
  File "main.py", line 8, in launch_training
    train.train(**kwargs)
  File "C:\Users\hoge\pix2pix\src\model\train.py", line 72, in train
    do_plot)
  File "C:\Users\hoge\pix2pix\src\model\models.py", line 310, in load
    model = generator_unet_upsampling(img_dim, bn_mode, model_name=model_name)
  File "C:\Users\hoge\pix2pix\src\model\models.py", line 92, in generator_unet_upsampling
    if K.image_dim_ordering() == "channels_first":
AttributeError: module 'keras.backend' has no attribute 'image_dim_ordering'

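I got told off. This is caused by a Keras version difference: newer versions of Keras changed part of the API, and image_dim_ordering no longer exists in keras.backend. A trap! I actually knew this would happen and deliberately installed without pinning a version. Environments such as Google Colaboratory ship with recent Keras and TensorFlow built in, and I would feel sorry for anyone who walks into this trap unknowingly, so I demonstrated it. As an aside, when TensorFlow complains that contrib does not exist (tf.contrib was removed in TensorFlow 2.x), downgrading seems to sort that out too.

So, let's downgrade.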
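
For readers who would rather patch the code than change versions, here is a hedged sketch of a compatibility helper. It is not used in this walkthrough, and I have not checked whether the rest of the repository runs on newer Keras. It relies on two facts: Keras 2 provides image_data_format(), which returns "channels_first" or "channels_last", and the legacy image_dim_ordering() returned "th" or "tf". The stand-in backend objects below exist only so the example runs without Keras installed.

```python
from types import SimpleNamespace

# Legacy ordering names and their Keras 2 equivalents.
_LEGACY_TO_NEW = {"th": "channels_first", "tf": "channels_last"}


def image_data_format(backend):
    # Prefer the Keras 2 API when the backend module offers it.
    if hasattr(backend, "image_data_format"):
        return backend.image_data_format()
    # Otherwise fall back to the legacy call and translate its answer.
    legacy = backend.image_dim_ordering()
    return _LEGACY_TO_NEW.get(legacy, legacy)


if __name__ == "__main__":
    # Stand-ins for keras.backend in a newer and an older Keras (hypothetical objects).
    newer = SimpleNamespace(image_data_format=lambda: "channels_last")
    older = SimpleNamespace(image_dim_ordering=lambda: "tf")
    print(image_data_format(newer), image_data_format(older))
```

In real code the argument would be the keras.backend module itself, and checks such as the one on line 92 of models.py would call this helper instead of K.image_dim_ordering().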

command

(pix) C:\Users\hoge\pix2pix\src\model>conda install keras==2.0.8

result

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\hoge\Anaconda3\envs\pix

  added / updated specs:
    - keras==2.0.8


The following packages will be REMOVED:

  keras-applications-1.0.8-py_0
  keras-base-2.3.1-py36_0
  keras-preprocessing-1.1.0-py_1
  tensorflow-estimator-2.1.0-pyhd54b08b_0

The following packages will be SUPERSEDED by a higher-priority channel:

  tensorboard        pkgs/main/noarch::tensorboard-2.2.1-p~ --> pkgs/main/win-64::tensorboard-1.10.0-py36he025d50_0

The following packages will be DOWNGRADED:

  cudatoolkit                           10.1.243-h74a9793_0 --> 9.0-1
  cudnn                                    7.6.5-cuda10.1_0 --> 7.6.5-cuda9.0_0
  keras                                             2.3.1-0 --> 2.0.8-py36h65e7a35_0
  tensorflow                       2.1.0-gpu_py36h3346743_0 --> 1.10.0-gpu_py36h3514669_0
  tensorflow-base                  2.1.0-gpu_py36h55f5790_0 --> 1.10.0-gpu_py36h6e53903_0
  tensorflow-gpu                           2.1.0-h0d30ee6_0 --> 1.10.0-hf154084_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

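You can see that conda also took care of matching the versions of the related TensorFlow packages (the GPU build, no less!), along with cudatoolkit and cudnn. Now let's run it again.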
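
To double-check from Python what actually ended up installed, something like the following sketch can be used. The expected version prefixes come from the conda output above (keras 2.0.8, tensorflow 1.10.0); the helper names are made up for this example.

```python
import importlib


def installed_version(module_name):
    # Return the module's __version__ string, or None if it cannot be imported
    # or does not expose one.
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_mismatches(expected):
    # expected maps module name -> required version prefix.
    # Returns {name: (prefix, actual)} for every module that does not match.
    problems = {}
    for name, prefix in expected.items():
        actual = installed_version(name)
        if actual is None or not actual.startswith(prefix):
            problems[name] = (prefix, actual)
    return problems


if __name__ == "__main__":
    print(version_mismatches({"keras": "2.0.8", "tensorflow": "1.10."}))
```

An empty dictionary means both packages match the versions conda reported.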

command

(pix) C:\Users\hoge\pix2pix\src\model>python main.py 64 64 --backend tensorflow --nb_epoch 10

result

(output omitted)
Start training
2020-07-06 20:11:14.114015: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-07-06 20:11:14.256531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.55GiB
2020-07-06 20:11:14.264273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2020-07-06 20:11:14.649123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-06 20:11:14.653675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2020-07-06 20:11:14.656771: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2020-07-06 20:11:14.660571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6286 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
396/400 [============================>.] - ETA: 0s - D logloss: 0.7649 - G tot: 12.9429 - G L1: 1.1987 - G logloss: 0.9557
Epoch 1/10, Time: 85.63616681098938
396/400 [============================>.] - ETA: 0s - D logloss: 0.7681 - G tot: 12.1516 - G L1: 1.1377 - G logloss: 0.7749
Epoch 2/10, Time: 80.52165269851685
396/400 [============================>.] - ETA: 0s - D logloss: 0.7439 - G tot: 11.6468 - G L1: 1.0932 - G logloss: 0.7151
Epoch 3/10, Time: 28.542045831680298
396/400 [============================>.] - ETA: 0s - D logloss: 0.7270 - G tot: 11.7284 - G L1: 1.0991 - G logloss: 0.7370
Epoch 4/10, Time: 28.93921661376953
396/400 [============================>.] - ETA: 0s - D logloss: 0.7150 - G tot: 11.4890 - G L1: 1.0733 - G logloss: 0.7556
Epoch 5/10, Time: 28.971989393234253
396/400 [============================>.] - ETA: 0s - D logloss: 0.7448 - G tot: 11.6123 - G L1: 1.0889 - G logloss: 0.7231
Epoch 6/10, Time: 29.158592700958252
396/400 [============================>.] - ETA: 0s - D logloss: 0.7278 - G tot: 11.3660 - G L1: 1.0649 - G logloss: 0.7168
Epoch 7/10, Time: 28.717032432556152
396/400 [============================>.] - ETA: 0s - D logloss: 0.7282 - G tot: 11.4358 - G L1: 1.0718 - G logloss: 0.7182
Epoch 8/10, Time: 29.007369995117188
396/400 [============================>.] - ETA: 0s - D logloss: 0.7159 - G tot: 11.3449 - G L1: 1.0643 - G logloss: 0.7019
Epoch 9/10, Time: 29.249063730239868
396/400 [============================>.] - ETA: 0s - D logloss: 0.7063 - G tot: 10.9784 - G L1: 1.0294 - G logloss: 0.6847
Epoch 10/10, Time: 29.128608226776123

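The sample ran without trouble. Hooray!

Addendum

Afterwards, I tried to set up the environment quickly on another PC. Following the same steps, the GPU was recognized, but training stopped just before it started with a CUDNN_STATUS_ALLOC_FAILED error. In the end, updating the GPU driver to the latest version from NVIDIA fixed it.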
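
If you want to look at the numbers rather than scroll through the console, the per-epoch lines can be pulled out with a couple of regular expressions. A minimal sketch that assumes the log format shown above (parse_training_log is a name made up for this example):

```python
import re

LOSS_RE = re.compile(
    r"D logloss: ([\d.]+) - G tot: ([\d.]+) - G L1: ([\d.]+) - G logloss: ([\d.]+)"
)
EPOCH_RE = re.compile(r"Epoch (\d+)/(\d+), Time: ([\d.]+)")


def parse_training_log(text):
    # Pair each progress-bar line (losses) with the "Epoch n/m" line that follows it.
    records = []
    pending = None
    for line in text.splitlines():
        loss = LOSS_RE.search(line)
        if loss:
            pending = [float(value) for value in loss.groups()]
            continue
        epoch = EPOCH_RE.search(line)
        if epoch and pending is not None:
            records.append({
                "epoch": int(epoch.group(1)),
                "seconds": float(epoch.group(3)),
                "d_logloss": pending[0],
                "g_total": pending[1],
                "g_l1": pending[2],
                "g_logloss": pending[3],
            })
            pending = None
    return records


if __name__ == "__main__":
    sample = (
        "396/400 [=>.] - ETA: 0s - D logloss: 0.7649 - G tot: 12.9429"
        " - G L1: 1.1987 - G logloss: 0.9557\n"
        "Epoch 1/10, Time: 85.63616681098938\n"
    )
    for record in parse_training_log(sample):
        print(record)
```

Run over the full log, this makes it easy to see that the first two epochs took over 80 seconds each while the rest took about 29 seconds.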
