
How to Install MMCV and Run ViTPose

Posted at 2024-04-11

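Overview

I struggled to install MMCV while trying to run a model called ViTPose, so I have summarized the installation methods here.

ViTPose is a pose estimation model that applies the vision transformer, which performs well on image recognition tasks, to human pose estimation.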

Yufei Xu, Jing Zhang, Qiming ZHANG, Dacheng Tao, ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation, NeurIPS2022.
- Poster & Video: https://neurips.cc/virtual/2022/poster/55265
- OpenReview: https://openreview.net/forum?id=6H2pBoPtm0s
- Proceedings: https://proceedings.neurips.cc/paper_files/paper/2022/hash/fbb10d319d44f8c3b4720873e4177c65-Abstract-Conference.html

MMCV is the foundational library used by most OpenMMLab projects (repositories).
Here I used it to run mmpose, an OpenMMLab component that provides 2D and 3D pose estimation.

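Installation methods: contents

Method 1: Install with mim (recommended by mmcv)
Method 2: Install with pip
Method 3: Clone the repository from GitHub and install from source
Bonus: Fixing errors that occurred when running ViTPose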

Note

The mmcv package name differs slightly between versions. Check carefully which version is required before installing.

Version	Package name (full)	Package name (lite)
2.x	mmcv	mmcv-lite
1.x	mmcv-full	mmcv

Do not install the full and lite packages at the same time; doing so can cause errors.
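The naming rule in the table can be written down as a small lookup, which also makes it easy to spot the case where both variants ended up installed. The following is a minimal sketch and not part of mmcv: the function names `mmcv_package_name` and `installed_mmcv_distributions` are made up for illustration, and the only data it relies on is the table above.

```python
from importlib import metadata
from typing import List

# Distribution names copied from the table above (major version -> variant -> name).
PACKAGE_NAMES = {
    1: {"full": "mmcv-full", "lite": "mmcv"},
    2: {"full": "mmcv", "lite": "mmcv-lite"},
}


def mmcv_package_name(version: str, with_ops: bool = True) -> str:
    """Return the pip package name for a given mmcv version and variant."""
    major = int(version.split(".")[0])
    if major not in PACKAGE_NAMES:
        raise ValueError(f"no naming rule known for mmcv {version}")
    return PACKAGE_NAMES[major]["full" if with_ops else "lite"]


def installed_mmcv_distributions() -> List[str]:
    """List which of the three possible mmcv distributions are installed."""
    found = []
    for name in ("mmcv", "mmcv-full", "mmcv-lite"):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


print(mmcv_package_name("1.7.2"))                  # full variant of a 1.x release
print(mmcv_package_name("2.1.0", with_ops=False))  # lite variant of a 2.x release
print(installed_mmcv_distributions())              # more than one entry suggests a conflict
```

If the last line prints more than one name, uninstalling all of them and reinstalling only the required one is probably the safest way out.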

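Method 1: mim install via openmim (recommended by mmcv)

This is the installation method recommended by mmcv.
https://mmcv.readthedocs.io/en/latest/get_started/installation.html
The page above also covers the pip install approach of Method 2, so I suggest trying the methods in this article in order from the top.
In my case, neither Method 1 nor Method 2 managed to install the specific version I needed; I finally succeeded with Method 3, cloning the repository from GitHub and installing from it.

mim is the package management tool for OpenMMLab projects, and it makes installing mmcv straightforward.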

pip install -U openmim
mim install mmcv

To install a specific version, pin it as follows (this example installs mmcv-full 1.7.2):

mim install mmcv-full==1.7.2

Method 2: Install with pip

Installing with pip requires the CUDA and PyTorch versions.
You can check them with the following command.

python -c 'import torch;print(torch.__version__);print(torch.version.cuda)'

Once you know the versions, choose the install command that matches your operating system, CUDA version, PyTorch version, and MMCV version.
https://mmcv.readthedocs.io/en/latest/get_started/installation.html

For Linux, CUDA 12.1, torch 2.1.x, and mmcv 2.1.0:

pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1/index.html
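The `-f` URL in the command above appears to encode the CUDA and PyTorch versions. The sketch below builds such a URL from the two strings printed by the version check. It is an assumption generalized from the single example in this article, not a documented rule: the function name `mmcv_find_links_url` is hypothetical, and other version combinations may use a different directory name or may not be published at all, so the official installation page remains the reference.

```python
def mmcv_find_links_url(torch_version: str, cuda_version: str) -> str:
    """Build a find-links URL following the pattern of the example above.

    torch_version: value of torch.__version__, e.g. "2.1.0+cu121"
    cuda_version:  value of torch.version.cuda, e.g. "12.1"
    """
    # Drop a local build suffix such as "+cu121" and keep only major.minor.
    release = torch_version.split("+")[0]
    major, minor = release.split(".")[:2]
    # "12.1" -> "cu121"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{cuda_tag}/torch{major}.{minor}/index.html"
    )


url = mmcv_find_links_url("2.1.0+cu121", "12.1")
print(f"pip install mmcv==2.1.0 -f {url}")
```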

The linked page makes it easier to find the right command for this method.

Method 3: Clone the repository from GitHub and install from source

This was the only method that let me install a specific version successfully.
First, clone the mmcv repository.
https://github.com/open-mmlab/mmcv

git clone https://github.com/open-mmlab/mmcv.git

Move into the repository and check out the tag for the version you want.
Here we check out version 1.7.2.

cd mmcv
git checkout v1.7.2

Finally, run pip install inside the repository.

MMCV_WITH_OPS=1 pip install -e .

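Here, MMCV_WITH_OPS=1 installs the full version and MMCV_WITH_OPS=0 installs the lite version.

If you install through sudo or similar, MMCV_WITH_OPS may not be picked up (sudo typically does not pass the caller's environment variables through), so you can no longer choose between the full and lite versions.
In that case, edit the value after name= inside setup() in setup.py to the name of the package you want to install, and then run pip install -e .

mmcv/setup.py (example after editing)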

setup(
    name='mmcv-lite',
    version=get_version(),
    description='OpenMMLab Computer Vision Foundation',
    ...
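After installing, it can be worth confirming which variant was actually built. The sketch below checks whether a compiled ops extension is importable. It assumes that the full build ships its compiled operators as a module named `mmcv._ext`; that name and the helper names `module_available` and `describe_mmcv_install` are assumptions for illustration, not something stated in this article.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (for example "mmcv" itself) is missing
        # or fails to import.
        return False


def describe_mmcv_install() -> str:
    """Report whether mmcv looks like a full build, a lite build, or is absent."""
    if not module_available("mmcv"):
        return "mmcv is not installed"
    # Assumption: the compiled operators live in the "mmcv._ext" extension module.
    if module_available("mmcv._ext"):
        return "full build (compiled ops found)"
    return "lite build (no compiled ops found)"


print(describe_mmcv_install())
```

A "lite build" result after an install that was meant to include the ops would suggest that MMCV_WITH_OPS did not reach the build.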

Bonus: Fixing errors that occurred when running ViTPose

1. unrecognized arguments: --local-rank=0

If you get an error like the following:

Error output

/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py:183: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects --local-rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
/mnt/HDD4TB-3/takama/mmcv/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module.
warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
usage: train.py [-h] [--work-dir WORK_DIR] [--resume-from RESUME_FROM] [--no-validate]
[--gpus GPUS | --gpu-ids GPU_IDS [GPU_IDS ...] | --gpu-id GPU_ID]
[--seed SEED] [--deterministic]
[--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]]
[--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK]
[--autoscale-lr]
config
train.py: error: unrecognized arguments: --local-rank=0
[2024-03-19 17:06:24,948] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 2) local_rank: 0 (pid: 30306) of binary: /usr/bin/python
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 198, in <module>
main()
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 194, in main
launch(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launch.py", line 179, in launch
run(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 803, in run
elastic_launch(
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 135, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
tools/train.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-03-19_17:06:24
host : b7ddb354ef6f
rank : 0 (local_rank: 0)
exitcode : 2 (pid: 30306)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

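The fix:

Changing python -m torch.distributed.launch to torchrun in dist_train.sh makes the error go away. As the log shows, the launcher passes --local-rank, while train.py only accepts --local_rank; torchrun instead provides the rank through the LOCAL_RANK environment variable.

ViTPose/tools/dist_train.sh (before the change; replace python -m torch.distributed.launch with torchrun)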

#!/usr/bin/env bash
# Copyright (c) OpenMMLab. All rights reserved.

CONFIG=$1
GPUS=$2
PORT=${PORT:-29500}

PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT \
    $(dirname "$0")/train.py $CONFIG --launcher pytorch ${@:3}
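An alternative to editing the shell script is to make the argument parser tolerant of both spellings. The sketch below is not the actual ViTPose train.py and not the fix used in this article; it is a minimal, hypothetical parser showing how `--local_rank` and `--local-rank` can share one destination, with the LOCAL_RANK environment variable taking precedence when a launcher sets it. The `env` parameter exists only to make the example easy to test.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None,
               env: Optional[Mapping[str, str]] = None) -> argparse.Namespace:
    """Parse launcher arguments, accepting either spelling of the local rank."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="hypothetical training entry point")
    parser.add_argument("config", help="path to the config file")
    parser.add_argument("--launcher",
                        choices=["none", "pytorch", "slurm", "mpi"],
                        default="none")
    # Both option strings store into args.local_rank (taken from the first name).
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    args = parser.parse_args(argv)
    # Prefer the environment variable when the launcher provides one.
    if "LOCAL_RANK" in env:
        args.local_rank = int(env["LOCAL_RANK"])
    return args


args = parse_args(["config.py", "--launcher", "pytorch", "--local-rank=0"], env={})
print(args.local_rank)
```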

2. TypeError: FormatCode() got an unexpected keyword argument 'verify'

If you get an error like the following:

Error output

/mnt/HDD4TB-3/takama/mmcv/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module. 
  warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
/mnt/HDD4TB-3/takama/ViTPose/mmpose/utils/setup_env.py:32: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/mnt/HDD4TB-3/takama/ViTPose/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
2024-03-28 20:29:08,479 - mmpose - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
GPU 0,4,5,6,7: NVIDIA GeForce RTX 2080 Ti
GPU 1: Quadro RTX 5000
GPU 2,3: Quadro GV100
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_12.3.r12.3/compiler.33567101_0
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.2.0+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

TorchVision: 0.17.0+cu121
OpenCV: 4.5.5
MMCV: 1.3.9
MMCV Compiler: n/a
MMCV CUDA Compiler: n/a
MMPose: 0.24.0+d521645
------------------------------------------------------------

2024-03-28 20:29:08,480 - mmpose - INFO - Distributed training: True
Traceback (most recent call last):
  File "/mnt/HDD4TB-3/takama/ViTPose/tools/train.py", line 195, in <module>
    main()
  File "/mnt/HDD4TB-3/takama/ViTPose/tools/train.py", line 159, in main
    logger.info(f'Config:\n{cfg.pretty_text}')
  File "/mnt/HDD4TB-3/takama/mmcv/mmcv/utils/config.py", line 479, in pretty_text
    text, _ = FormatCode(text, style_config=yapf_style, verify=True)
TypeError: FormatCode() got an unexpected keyword argument 'verify'
[2024-03-28 20:29:10,076] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 112291) of binary: /usr/bin/python
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
tools/train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-28_20:29:10
  host      : 8c42870c567a
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 112291)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

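The fix:
Upgrading mmcv to version 1.7.2 makes the error go away (the log above shows MMCV 1.3.9).
However, this breaks the version compatibility check in mmpose and the other mm* packages, so set mmcv_maximum_version in the __init__.py under the mmpose (or other mm*) directory to 1.7.2 or higher.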

mmpose/__init__.py

# Copyright (c) OpenMMLab. All rights reserved.
import mmcv
import mmengine
from mmengine.utils import digit_version

from .version import __version__, short_version

mmcv_minimum_version = '1.3.0'
mmcv_maximum_version = '1.7.2'
mmcv_version = digit_version(mmcv.__version__)

mmengine_minimum_version = '0.6.0'
mmengine_maximum_version = '1.0.0'
mmengine_version = digit_version(mmengine.__version__)

assert (mmcv_version >= digit_version(mmcv_minimum_version)
        and mmcv_version <= digit_version(mmcv_maximum_version)), \
    f'MMCV=={mmcv.__version__} is used but incompatible. ' \
    f'Please install mmcv>={mmcv_minimum_version}, <={mmcv_maximum_version}.'

assert (mmengine_version >= digit_version(mmengine_minimum_version)
        and mmengine_version <= digit_version(mmengine_maximum_version)), \
    f'MMEngine=={mmengine.__version__} is used but incompatible. ' \
    f'Please install mmengine>={mmengine_minimum_version}, ' \
    f'<={mmengine_maximum_version}.'

__all__ = ['__version__', 'short_version']
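To see why raising mmcv_maximum_version lets the assertion pass, the range check can be reproduced in isolation. The sketch below is a simplified stand-in for the check, not the library's own `digit_version`: the helpers `parse_release` and `is_supported` are hypothetical, and they deliberately ignore pre-release suffixes such as `rc1`, which the real function handles.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn the leading numeric part of a version into a comparable tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad to three components so that "1.7" compares equal to "1.7.0".
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_supported(installed: str, minimum: str, maximum: str) -> bool:
    """Inclusive range check, mirroring the >= and <= in the assertion."""
    return parse_release(minimum) <= parse_release(installed) <= parse_release(maximum)


# With the maximum raised to 1.7.2, an installed mmcv 1.7.2 is accepted.
print(is_supported("1.7.2", "1.3.0", "1.7.2"))
# With a lower maximum, the same install would trip the assertion.
print(is_supported("1.7.2", "1.3.0", "1.7.0"))
```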
