Previous articles
For how to build the training environment directly on a PC, see below.
(under revision)
Prerequisites
A GPU environment is assumed to be already set up on the local machine
(the GPU can be confirmed with nvidia-smi).
To install Docker, refer to a guide such as the following:
https://qiita.com/ttsubo/items/c97173e1f04db3cbaeda
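As a quick sanity check (a sketch, assuming Docker 19.03 or later with the NVIDIA Container Toolkit installed), both of the following should print the GPU table; the second command reuses the same base image as the Dockerfile shown later:
nvidia-smi
docker run --rm --gpus all nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04 nvidia-smi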
Basic Docker operations
See:
https://qiita.com/19503/private/a6f0d43c8cfc44748f62
Building the environment with a Docker container
To avoid depending on the GPU and CUDA generation of the machine you run on, I recommend using a Docker container as shown below rather than working directly in the host GPU environment.
# 1. Operations on the host PC
# set up COCO data (coco2017)
COCO_DIR=/home/your_work_dir/your_coco_data_dir
mkdir -p $COCO_DIR/images
cd $COCO_DIR/images
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
cd $COCO_DIR
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
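# extract the archives so that $COCO_DIR/images/train2017, $COCO_DIR/images/val2017 and
# $COCO_DIR/annotations exist (a sketch; install unzip first if it is not available)
unzip annotations_trainval2017.zip
cd $COCO_DIR/images
unzip train2017.zip
unzip val2017.zip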
cd /home/your_work_dir
git clone https://github.com/pytorch/pytorch.git -b v0.2.0
cd pytorch
# edit the Dockerfile as in last-one-Dockerfile shown later (keep the file name as Dockerfile)
docker build -t last-one:v0.01 -f ./Dockerfile .
# --ipc="host" shares memory between the host PC and the container; without it, execution crashes mid-run due to insufficient shared memory (see below)
# https://discuss.pytorch.org/t/unable-to-write-to-file-torch-18692-1954506624/9990
docker run --gpus all --ipc="host" -v /home/your_work_dir/your_coco_data_dir:/home/dat -itd --name last-one_cont last-one:v0.01
docker exec -it last-one_cont /bin/bash
# 2. Operations inside the Docker container
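# optional quick check that the GPUs and the PyTorch build are visible inside the container
# (torch here is the PyTorch v0.2.0 built by the Dockerfile below)
nvidia-smi
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"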
pip install pycocotools
cd /home/dev/Pytorch_Realtime_Multi-Person_Pose_Estimation/preprocessing
COCO_DIR=/home/dat
apt-get update
pip install numpy==1.16.6 opencv-python==3.1.0.0 easydict==1.9
python generate_json_mask.py --ann_path $COCO_DIR/annotations/person_keypoints_train2017.json --json_path out/train2017_json.json --mask_dir out/train2017_maskdir --filelist_path out/train2017_filelst.txt --masklist_path out/train2017_masklst.txt
python generate_json_mask.py --ann_path $COCO_DIR/annotations/person_keypoints_val2017.json --json_path out/val2017_json.json --mask_dir out/val2017_maskdir --filelist_path out/val2017_filelst.txt --masklist_path out/val2017_masklst.txt
# if you encounter an "import _tkinter # If this fails your Python may not be configured for Tk" error, install tk-dev as below
# apt-get update
# apt-get install tk-dev
# prepend the data paths in the generated list files with the vi editor (example below; the lists were written under out/ above)
vi out/train2017_filelst.txt
:%s/^/\/home\/dat\/images\/train2017\//g
vi out/val2017_filelst.txt
:%s/^/\/home\/dat\/images\/val2017\//g
vi out/train2017_masklst.txt
:%s/^/\/home\/dev\/Pytorch_Realtime_Multi-Person_Pose_Estimation\/preprocessing\//g
vi out/val2017_masklst.txt
:%s/^/\/home\/dev\/Pytorch_Realtime_Multi-Person_Pose_Estimation\/preprocessing\//g
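# alternatively, the same prepending can be done non-interactively with sed instead of vi (equivalent sketch):
# sed -i 's|^|/home/dat/images/train2017/|' out/train2017_filelst.txt
# sed -i 's|^|/home/dat/images/val2017/|' out/val2017_filelst.txt
# sed -i 's|^|/home/dev/Pytorch_Realtime_Multi-Person_Pose_Estimation/preprocessing/|' out/train2017_masklst.txt
# sed -i 's|^|/home/dev/Pytorch_Realtime_Multi-Person_Pose_Estimation/preprocessing/|' out/val2017_masklst.txt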
# the --gpu option must be changed to match your own GPU setup, and CUDA_VISIBLE_DEVICES at line 255 of
# train/train_pose.py must be edited accordingly, as described in the previous article
cd /home/dev/Pytorch_Realtime_Multi-Person_Pose_Estimation/train
TRAIN_DIR=/home/dev/Pytorch_Realtime_Multi-Person_Pose_Estimation/preprocessing/out
LOG=train.log  # training log output (any path)
python train_pose.py --gpu 0 1 2 --train_dir $TRAIN_DIR/train2017_filelst.txt $TRAIN_DIR/train2017_masklst.txt $TRAIN_DIR/train2017_json.json --val_dir $TRAIN_DIR/val2017_filelst.txt $TRAIN_DIR/val2017_masklst.txt $TRAIN_DIR/val2017_json.json --config config.yml > $LOG
last-one-Dockerfile
FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04
RUN echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
cmake \
git \
wget \
vim \
ca-certificates \
libjpeg-dev \
libpng-dev && \
rm -rf /var/lib/apt/lists/*
RUN wget https://repo.anaconda.com/archive/Anaconda2-2.4.0-Linux-x86_64.sh -O ./anaconda.sh && \
chmod +x ./anaconda.sh && \
./anaconda.sh -b -p /opt/conda && \
rm ./anaconda.sh && \
/opt/conda/bin/conda install conda-build && \
/opt/conda/bin/conda create -y --name pytorch-py27 python=2.7.13 numpy pyyaml scipy ipython mkl && \
/opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/envs/pytorch-py27/bin:$PATH
RUN conda install --name pytorch-py27 -c soumith magma-cuda80
# This must be done before pip so that requirements.txt is available
WORKDIR /opt/pytorch
COPY . .
# install torch
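# TORCH_CUDA_ARCH_LIST below selects the GPU compute capabilities the CUDA kernels are compiled for
# (3.5 Kepler, 5.2 Maxwell, 6.0/6.1 Pascal); adjust it to match your own GPU generation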
RUN TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1+PTX" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" \
CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" \
pip install -v .
# install torchvision
RUN git clone https://github.com/pytorch/vision.git -b v0.2.0
WORKDIR /opt/pytorch/vision
RUN pip install -v .
# clone OpenPose code
WORKDIR /home/dev
RUN git clone https://github.com/last-one/Pytorch_Realtime_Multi-Person_Pose_Estimation.git
# setup coco tools
RUN git clone https://github.com/cocodataset/cocoapi.git
RUN pip install pandas
WORKDIR /workspace