
Notes on installing Docker with GPU support on Debian 11.0 (bullseye)

Posted at 2021-08-20

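Summary

  • Tested whether the GPU is usable from pytorch (1.8.1+cu111) installed inside Docker on Debian 11.0 (bullseye)
  • With my RTX 3060, it failed with errors until I switched the host driver to the 460.x series
  • Several other errors also had to be worked around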

Host environment

  • Debian 11.0 bullseye
  • Nvidia RTX 3060
    • Driver Version: 465.31 -> 460.84 (changed later)
    • CUDA Version: 11.3 -> 11.2 (changed later)

Installing Docker

Getting the GPU usable

Getting the device recognized

Start the container:

$ docker run -it --rm --gpus all -v `pwd`:"/home/jovyan/work" -p 8888:8888 jupyter/scipy-notebook
!pip install transformers==4.5.0 fugashi==1.1.0 ipadic==1.0.0 torch==1.8.1
  • The container would not start and printed the error below, so I fixed it as follows:
Could not select device driver "" with capabilities: [[gpu]].
distribution="debian10"
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
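
Before retrying docker run, it may help to confirm that the restart actually registered the NVIDIA runtime. The following is a minimal sketch and not part of my original steps: it assumes that docker info --format '{{json .Runtimes}}' prints a JSON map of runtimes and that nvidia-docker2 registers one named "nvidia". The function names are hypothetical.

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json: str) -> bool:
    """Return True if the JSON runtime map contains an "nvidia" entry."""
    runtimes = json.loads(runtimes_json)
    return isinstance(runtimes, dict) and "nvidia" in runtimes


def docker_runtimes_json() -> str:
    """Ask the local Docker daemon for its registered runtimes as JSON text.

    Assumption: the docker CLI is on PATH and the daemon is running.
    """
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

On the host, has_nvidia_runtime(docker_runtimes_json()) would be expected to return True after the install. This is only a proxy: a registered runtime does not by itself prove that --gpus all will work.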

Fixing the "devices not found" problem (I forgot at which point I did this)

Error:

cgroup subsystem devices not found:
$ sudo nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=0"

$ sudo update-grub
$ sudo shutdown -r now
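
After the reboot, one way to see whether the kernel parameter took effect is to check which cgroup layout is mounted. This is a rough sketch under an assumption: on a pure cgroup v2 (unified) mount, a cgroup.controllers file exists at the top of /sys/fs/cgroup. The function name and return labels are hypothetical.

```python
from pathlib import Path


def cgroup_mode(root: str = "/sys/fs/cgroup") -> str:
    """Return "v2" when the unified hierarchy is mounted at root, else "v1-or-hybrid"."""
    # Assumption: only a pure cgroup v2 mount exposes cgroup.controllers at the top level.
    if (Path(root) / "cgroup.controllers").is_file():
        return "v2"
    return "v1-or-hybrid"
```

With systemd.unified_cgroup_hierarchy=0 applied, cgroup_mode() on the host would be expected to report "v1-or-hybrid"; "v2" suggests the GRUB change was not picked up.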

Getting nvidia-smi callable

Start the container:

$ docker run -it --rm --gpus all -v `pwd`:"/home/jovyan/work" -p 8888:8888 jupyter/scipy-notebook

bert.cuda() fails, so the GPU is still not usable:

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Start the container again, this time as root with sudo enabled:

$ docker run -it --rm --gpus all -v `pwd`:"/home/jovyan/work" -p 8888:8888 -e GRANT_SUDO=yes --user root jupyter/scipy-notebook

Add this Jupyter magic cell at the very top of the notebook to check:

%%bash
sudo ldconfig
nvidia-smi

nvidia-smi now works, but a different error appears:

/opt/conda/lib/python3.9/site-packages/torch/cuda/__init__.py:104: UserWarning: 
NVIDIA GeForce RTX 3060 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3060 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
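
The warning compares the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The check can be sketched as below. This is my own simplification, not PyTorch's code: it assumes a kernel built for sm_XY can run on a device of the same major version X whose minor version is at least Y, and it ignores PTX (compute_XY) entries. The helper names are hypothetical; torch.cuda.get_device_capability and torch.cuda.get_arch_list are the PyTorch calls it relies on.

```python
import re
from typing import Iterable, Tuple


def is_arch_supported(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """Return True if some sm_XY entry can run on a device with this capability."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)", arch)
        if match is None:
            continue  # skip compute_XY (PTX) and other non-sm entries
        number = int(match.group(1))
        built_major, built_minor = number // 10, number % 10
        # Same major version, and the device minor is not older than the build target.
        if built_major == dev_major and built_minor <= dev_minor:
            return True
    return False


def check_current_device() -> bool:
    """Run the check against the installed PyTorch and GPU 0."""
    import torch  # imported lazily so the pure helper works without torch

    return is_arch_supported(
        torch.cuda.get_device_capability(0), torch.cuda.get_arch_list()
    )
```

For the case in the warning, is_arch_supported((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]) returns False, which matches the sm_86 complaint.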

In the notebook cell that installs the packages, reinstall torch with the CUDA 11.1 build:

!pip uninstall -y torch
!pip install transformers==4.5.0 fugashi==1.1.0 ipadic==1.0.0 torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
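
To confirm which build actually got installed, the +cu111 suffix of the version string can be read back. A small sketch, assuming the suffix follows the +cuXYZ pattern where the last digit is the CUDA minor version and the rest is the major version; the function name is hypothetical.

```python
import re
from typing import Optional


def cuda_version_from_wheel(version: str) -> Optional[str]:
    """Extract "11.1" from "1.8.1+cu111"; return None when there is no +cu suffix."""
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: last digit is the minor version, everything before it is the major.
    return f"{digits[:-1]}.{digits[-1]}"
```

In the notebook this would be called as cuda_version_from_wheel(torch.__version__). A result of None suggests that pip picked up the default wheel rather than the cu111 one.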

A little further on, yet another error:

RuntimeError: CUDA error: no kernel image is available for execution on the device
!conda install -y pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge

Downgrading the driver version

Driver Version: 460.84       CUDA Version: 11.2
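
The header line above can be checked programmatically, for example to confirm that the downgrade really landed on the 460.x series. A minimal sketch, assuming the nvidia-smi header keeps the "Driver Version: ... CUDA Version: ..." layout shown here; the function names are hypothetical. As far as I understand, the CUDA version printed by nvidia-smi is the highest version the driver supports, not the toolkit inside the container.

```python
import re
from typing import Optional, Tuple

_HEADER = re.compile(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)")


def parse_smi_versions(text: str) -> Optional[Tuple[str, str]]:
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    match = _HEADER.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_driver_series(driver_version: str, series: str = "460") -> bool:
    """Return True if the driver's major component equals the given series."""
    return driver_version.split(".")[0] == series
```

Feeding it the line above gives ("460.84", "11.2"), and is_driver_series("460.84") is True, while the earlier 465.31 driver would fail the check.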
  • pip
  • In the end I installed with the command below, and the notebook seems to have run through to the end
!pip install transformers==4.5.0 fugashi==1.1.0 ipadic==1.0.0 torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html