Notes
- The Proxmox version is 8.2.2.
- For how to make the GPU visible inside an unprivileged LXC, see the linked article.
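Motivation
- I want to use Docker with GPU support in multiple unprivileged LXCs.
- With the recent progress in AI, more and more services assume that a GPU is available.
- Many of them also provide a Dockerfile.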
Confirming that the GPU is recognized inside the unprivileged LXC
nvidia-smi
# +-----------------------------------------------------------------------------------------+
# | NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
# |-----------------------------------------+------------------------+----------------------+
# | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
# |                                         |                        |               MIG M. |
# |=========================================+========================+======================|
# |   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
# | 30%   51C    P0              N/A / 450W |        1MiB / 24564MiB |      0%      Default |
# |                                         |                        |                  N/A |
# +-----------------------------------------+------------------------+----------------------+
# +-----------------------------------------------------------------------------------------+
# | Processes:                                                                              |
# |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
# |        ID   ID                                                               Usage      |
# |=========================================================================================|
# |  No running processes found                                                             |
# +-----------------------------------------------------------------------------------------+
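If `nvidia-smi` fails inside the container, looking at the device nodes can help narrow down the cause. The following is a minimal sketch and not part of the original procedure: the helper name `check_nvidia_nodes` is hypothetical, and the node list (`nvidia0`, `nvidiactl`, `nvidia-uvm`) assumes a single-GPU setup with the usual NVIDIA device names.

```shell
# Hypothetical helper: report which NVIDIA device nodes are missing.
# The directory defaults to /dev; the node names assume a single GPU.
check_nvidia_nodes() {
    dev_dir="${1:-/dev}"
    missing=0
    for node in nvidia0 nvidiactl nvidia-uvm; do
        if [ ! -e "$dev_dir/$node" ]; then
            echo "missing: $dev_dir/$node"
            missing=1
        fi
    done
    # Exit status 0 when every node exists, 1 otherwise
    return "$missing"
}
```

Calling `check_nvidia_nodes` without an argument inspects `/dev`. A non-zero exit status suggests that at least one node was not passed through to the container.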
Installing Docker
- For the latest installation procedure, refer to the official Docker documentation.
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
# To install the latest version, run:
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
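# Add the current user to the docker group (-f: no error if the group already exists)
sudo groupadd -f docker
sudo gpasswd -a "$USER" docker
sudo service docker restart
# Start a shell with the new group applied, so no re-login is needed
newgrp docker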
# Hello World
docker run hello-world
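If `docker run` fails with a permission error on the Docker socket, the group membership is worth checking first. Below is a small sketch of such a check, assuming `id`, `tr`, and `grep` are available; the helper name `user_in_group` is made up for illustration.

```shell
# Hypothetical helper: succeed if user $1 is a member of group $2
# according to the system's user database.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Example: print a hint about the current user's docker group membership.
current_user="$(id -un)"
if user_in_group "$current_user" docker; then
    echo "$current_user is in the docker group"
else
    echo "$current_user is not in the docker group yet"
fi
```

Because `id` is given an explicit user name, it reports the membership recorded in the user database, so the result reflects the `gpasswd` call even before the current shell has picked up the new group.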
Installing the NVIDIA Container Toolkit
- For the latest installation procedure, refer to the official NVIDIA documentation.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Optional: enable the experimental packages in the repository list
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
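`nvidia-ctk runtime configure --runtime=docker` registers the runtime by editing Docker's daemon configuration. The sketch below checks for that entry, assuming the default location `/etc/docker/daemon.json`; the helper name `has_nvidia_runtime` is hypothetical. It only looks for the `"nvidia"` key as text and does not parse the JSON.

```shell
# Hypothetical helper: succeed if the given daemon.json mentions an "nvidia" key.
# The path defaults to /etc/docker/daemon.json (an assumption about the setup).
has_nvidia_runtime() {
    conf="${1:-/etc/docker/daemon.json}"
    [ -r "$conf" ] && grep -q '"nvidia"[[:space:]]*:' "$conf"
}
```

Running `has_nvidia_runtime && echo "nvidia runtime registered"` after the restart gives a quick confirmation before moving on to the toolkit configuration.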
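Changing the NVIDIA Container Toolkit configuration
- An unprivileged LXC is generally not permitted to manage device cgroups, so the toolkit's cgroup handling is disabled here.
sudo vi /etc/nvidia-container-runtime/config.toml
# Change as follows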
[-] #no-cgroups = false
[+] no-cgroups = true
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
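When several containers need the same change, the manual edit of `no-cgroups` can be scripted. This is a minimal sketch rather than an official procedure: the helper name `enable_no_cgroups` is hypothetical, and it assumes GNU `sed` (for `-i`) and that the `no-cgroups` line is already present in the file, commented out or not.

```shell
# Hypothetical helper: set no-cgroups = true in a config.toml-style file.
# Matches the existing line whether or not it is commented out, so the
# setting stays in its original section. Safe to run more than once.
enable_no_cgroups() {
    conf="$1"
    sed -i 's/^[[:space:]]*#*[[:space:]]*no-cgroups[[:space:]]*=.*/no-cgroups = true/' "$conf"
}
```

From a root shell this would be called as `enable_no_cgroups /etc/nvidia-container-runtime/config.toml`. Opening the file afterwards to confirm the result is still a good idea, since the sketch does nothing if the line is missing.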
Verifying the setup
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
# +-----------------------------------------------------------------------------------------+
# | NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
# |-----------------------------------------+------------------------+----------------------+
# | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
# |                                         |                        |               MIG M. |
# |=========================================+========================+======================|
# |   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
# | 30%   51C    P0              N/A / 450W |        1MiB / 24564MiB |      0%      Default |
# |                                         |                        |                  N/A |
# +-----------------------------------------+------------------------+----------------------+
# +-----------------------------------------------------------------------------------------+
# | Processes:                                                                              |
# |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
# |        ID   ID                                                               Usage      |
# |=========================================================================================|
# |  No running processes found                                                             |
# +-----------------------------------------------------------------------------------------+