
Setting up a GPU environment

Posted at 2018-06-29. Last updated at 2018-07-06.

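I've been setting up a GPU server at work and have hit a number of snags along the way, so I'm going to collect my notes here.
After installing the NVIDIA driver and getting nvidia-docker running, I tried to run Chainer and got stuck on a cuDNN-related problem. These are my notes from that.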

Environment

Ubuntu 16.04.4 LTS
nvidia-driver 390.67
CUDA 9.1
Python 3.5.2

Where I got stuck

When I tried to train with Chainer on the GPU, I got the following error.
CUDNN_STATUS_NOT_INITIALIZED

Solution

It looks like I had installed cupy incorrectly.
I had originally installed cupy like this:
pip install cupy
Reinstalling it as follows made the error go away:
pip install cupy-cuda91
Of course, I uninstalled the existing cupy before reinstalling:
pip uninstall cupy
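Since the fix depends on removing the old package before installing the new one, it can help to check which cupy distributions are actually present. The following is only a minimal sketch: `installed_cupy_packages` and `has_conflict` are hypothetical helper names of my own, and it relies on `importlib.metadata`, which needs Python 3.8 or later (newer than the 3.5.2 used in this article).

```python
from importlib import metadata


def installed_cupy_packages():
    """Return the sorted, lower-cased names of installed distributions starting with 'cupy'."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Broken installs can report no name at all, so skip those.
        if name and name.lower().startswith("cupy"):
            names.add(name.lower())
    return sorted(names)


def has_conflict(names):
    """Assumption: more than one cupy distribution side by side is worth cleaning up."""
    return len(names) > 1


if __name__ == "__main__":
    found = installed_cupy_packages()
    print("cupy distributions:", found if found else "none")
    if has_conflict(found):
        print("More than one cupy package is installed; uninstall the extras first.")
```

If this lists both cupy and cupy-cuda91, uninstalling both and then installing only the wheel that matches the CUDA version is probably the safest route.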

Discussion

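I solved this with the help of the following Note on the official Chainer site. When installing cupy with pip, it seems you have to install cupy-cudaXX, matching your CUDA version, rather than cupy.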

Note. If you are using a wheel, cupy shall be replaced with cupy-cudaXX (where XX is a CUDA version number).
https://docs-cupy.chainer.org/en/latest/install.html#run-cupy-with-docker
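To make the XX in the Note concrete, here is a small sketch that turns a CUDA version string into a wheel name. `cupy_wheel_name` is a hypothetical helper, and the naming rule (major and minor digits joined together) is my reading of the Note plus the cupy-cuda91 package used above. Newer CuPy releases name their wheels differently, so treat this as an illustration for the versions of that era only.

```python
def cupy_wheel_name(cuda_version):
    """Build a wheel name such as 'cupy-cuda91' from a CUDA version such as '9.1'.

    Assumption: XX is the major and minor version joined together; any
    patch component (for example the '85' in '9.1.85') is ignored.
    """
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError("expected a version like '9.1', got {!r}".format(cuda_version))
    return "cupy-cuda{}{}".format(parts[0], parts[1])


if __name__ == "__main__":
    # CUDA 9.1 is the version listed in the environment section of this article.
    print("pip install " + cupy_wheel_name("9.1"))  # prints: pip install cupy-cuda91
```

The CUDA version itself can be read from the output of nvcc --version, if the toolkit is installed.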

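Installing with pip install cupy-cudaXX also installs the latest cuDNN and NCCL.
Installing with pip install cupy does not install cuDNN or NCCL, so they have to be installed manually.
Use pip install cupy if you want to choose the cuDNN and NCCL versions yourself.

It seems the error above appeared because cuDNN was not installed.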
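One rough way to check this guess is to see whether CuPy's cuDNN and NCCL bindings can be imported at all. The sketch below assumes that `cupy.cuda.cudnn` and `cupy.cuda.nccl` are importable only when CuPy was built or packaged with those libraries, which I believe held for the CuPy versions of that time but may not hold for later ones. `probe` and `gpu_stack_report` are hypothetical names.

```python
import importlib


def probe(module_name):
    """Return True if the module imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
        return True
    except Exception:
        # ImportError when the module is absent; other errors can come
        # from a build that cannot load its shared libraries.
        return False


def gpu_stack_report():
    """Map each module of interest to whether it could be imported."""
    return {
        "cupy": probe("cupy"),
        "cupy.cuda.cudnn": probe("cupy.cuda.cudnn"),
        "cupy.cuda.nccl": probe("cupy.cuda.nccl"),
    }


if __name__ == "__main__":
    for name, ok in sorted(gpu_stack_report().items()):
        print("{:<16} {}".format(name, "importable" if ok else "missing"))
```

A successful import only shows that the bindings are present. It does not prove that cuDNN initializes correctly on the GPU, so running a small training job is still the real test.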
