
Building TensorFlow r1.10 (r1.11) with CUDA 10.0 on a MacBook Pro

Posted at 2018-10-05

Environment

  • MacBook Pro (15-inch, 2016)
  • macOS High Sierra 10.13.6(17G65)
  • GeForce GTX 1080 (GV-N1080IXEB-8GD external GPU enclosure)
  • NVIDIA web Driver 387.10.10.10.40.105
  • CUDA Driver 410.130
  • Xcode 8.3.2
  • python 3.6.6/2.7.15 (pyenv, pyenv-virtualenv)
  • CUDA Toolkit 10.0
  • cuDNN v7.3.1 (Sept 28, 2018), for CUDA 10.0
  • NCCL 2.3.5

Very condensed procedure

[ Environment setup ]

  • [ install Homebrew ]
  • [ install pyenv, pyenv-virtualenv ]
    • brew install pyenv
    • brew install pyenv-virtualenv
  • [ install python 3.6.6 ]
    • pyenv install 3.6.6
    • pyenv virtualenv 3.6.6 tensorflow_python366
    • pyenv activate tensorflow_python366
    • Install the required modules with pip install (see the sketch below)

Up to this point, set things up in whatever environment you prefer (virtualenv or native).
If you go native, use sudo as appropriate when installing.
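
For reference, the virtualenv setup above can be scripted roughly as follows. This is only a sketch: the pip package list at the end is my assumption (typical build/runtime dependencies for TensorFlow r1.10), not something fixed by this procedure.

# Sketch of the pyenv-based setup (package list is an assumption; adjust to taste)
brew install pyenv pyenv-virtualenv
pyenv install 3.6.6
pyenv virtualenv 3.6.6 tensorflow_python366
pyenv activate tensorflow_python366
pip install numpy six wheel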

  • [ download Tensorflow source code ]
    • git clone https://github.com/tensorflow/tensorflow
    • cd ./tensorflow
    • git checkout r1.10 (the same steps also worked for building r1.11)
    • At this point, verify that the TensorFlow cifar10 example can run.
  • [ CUDA Toolkit 10.0 Download & install ]
  • [ Download cuDNN v7.3.1 (Sept 28, 2018), for CUDA 10.0 ]
    • cuDNN v7.3.1 Library for OSX
    • https://developer.nvidia.com/rdp/cudnn-download
    • tar xvf cudnn-10.0-osx-x64-v7.3.1.20.tar (run in ~/Downloads)
    • sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
    • sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib/
    • sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn*
  • [ Download NCCL v2.3.5, for CUDA 10.0, Sept 25, 2018 ]
    • https://developer.nvidia.com/nccl/nccl-download
    • NCCL 2.3.5 O/S agnostic and CUDA 10.0
    • tar xvf nccl_2.3.5-2+cuda10.0_x86_64.txz (run in ~/Downloads)
    • cd nccl_2.3.5-2+cuda10.0_x86_64/lib
    • sudo mv * /Developer/NVIDIA/CUDA-10.0/lib/
    • cd nccl_2.3.5-2+cuda10.0_x86_64/include
    • sudo mv nccl.h /Developer/NVIDIA/CUDA-10.0/include/
    • cd tensorflow/third_party/nccl/
    • ln -s /Developer/NVIDIA/CUDA-10.0/include/nccl.h
    • Note: third_party/nccl/nccl_configure.bzl documents the environment variables below, but setting them caused an error because the library could not be found.
 `nccl_configure` depends on the following environment variables:

  * `TF_NCCL_VERSION`: The NCCL version.
  * `NCCL_INSTALL_PATH`: The installation path of the NCCL library.
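
So instead of setting those variables, leave them unset and rely on the symlink created above. A quick sanity check that the headers and libraries ended up where the build will look for them (paths as used above; adjust if yours differ):

# Sanity check of the cuDNN / NCCL layout used above
ls -l /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn*
grep -m1 CUDNN_MAJOR /usr/local/cuda/include/cudnn.h       # should print 7
ls -l /Developer/NVIDIA/CUDA-10.0/include/nccl.h /Developer/NVIDIA/CUDA-10.0/lib/libnccl*
ls -l third_party/nccl/nccl.h                               # the symlink created above (run from the tensorflow checkout)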

Code modifications

  • Remove the __align__(sizeof(T)) annotations in the following files (a sed sketch covering all three edits follows this list)

    • tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
    • tensorflow/core/kernels/split_lib_gpu.cu.cc
    • tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc
  • Remove linkopts = ["-lgomp"]

    • tensorflow/third_party/gpus/cuda/BUILD.tpl
  • constexpr Variant() noexcept = default; // remove "constexpr"

    • tensorflow/core/framework/variant.h
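
If you prefer to apply these edits from the shell, the sed invocations below sketch the intent (run from the root of the tensorflow checkout; note that third_party/ sits at the repository root; BSD sed syntax). The exact patterns are assumptions about how the lines look in r1.10/r1.11, so review the result with git diff rather than trusting them blindly.

# Sketch only: apply the three edits mechanically, then review with `git diff`
sed -i '' 's/__align__(sizeof(T))//g' \
  tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc \
  tensorflow/core/kernels/split_lib_gpu.cu.cc \
  tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc
sed -i '' '/linkopts = \["-lgomp"\]/d' third_party/gpus/cuda/BUILD.tpl
sed -i '' 's/constexpr Variant() noexcept = default;/Variant() noexcept = default;/' \
  tensorflow/core/framework/variant.h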

.bashrc

export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"

export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib:/Developer/NVIDIA/CUDA-10.0/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH
export PATH=/Developer/NVIDIA/CUDA-10.0/bin${PATH:+:${PATH}}
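
After reloading the shell configuration, it is worth a quick check that the CUDA toolchain is actually visible (nothing more than a sanity check):

# Reload and verify the toolchain is on the path
source ~/.bashrc
which nvcc          # expect /Developer/NVIDIA/CUDA-10.0/bin/nvcc
nvcc --version      # expect "release 10.0"
echo $DYLD_LIBRARY_PATH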

configure & build

  • Disable SIP (System Integrity Protection)
  • ./configure

Values entered at the prompts: CUDA 10.0, cuDNN 7.3.1, compute capability 6.1

Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 10.0

Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.3.1

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]: 6.1
  • bazel build --config=cuda --config=opt --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package
Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
  bazel-bin/tensorflow/tools/pip_package/build_pip_package
INFO: Elapsed time: 4855.958s, Critical Path: 245.05s
INFO: 5073 processes: 5073 local.
INFO: Build completed successfully, 5262 total actions
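
For reference, the same answers can be supplied to ./configure non-interactively through environment variables. The variable names below are what TensorFlow's configure.py of that era reads, as far as I know; treat this as a sketch and fall back to the interactive prompts if your checkout differs.

# Non-interactive ./configure (variable names assumed from configure.py; verify against your checkout)
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=10.0
export TF_CUDNN_VERSION=7.3.1
export TF_CUDA_COMPUTE_CAPABILITIES=6.1
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export CUDNN_INSTALL_PATH=/usr/local/cuda
./configure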

install

  • ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

  • pip install /tmp/tensorflow_pkg/tensorflow-1.10.1-cp36-cp36m-macosx_10_13_x86_64.whl
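
Before moving on to the tutorial, a one-liner to confirm the installed wheel actually sees the GPU (TensorFlow 1.x API):

# Quick GPU visibility check with the installed wheel
python -c "import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available())"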

cifar10_train.py
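
cifar10_train.py comes from the tensorflow/models repository, not from the tensorflow checkout itself, so fetch it first (the tutorials/image/cifar10 layout assumed here is the one the 2018-era models repository used):

# Fetch the tutorial code (2018-era tensorflow/models layout assumed)
git clone https://github.com/tensorflow/models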

cd models/tutorials/image/cifar10
python cifar10_train.py

Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2018-10-05 23:04:13.815498: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:858] OS X does not support NUMA - returning NUMA node zero
2018-10-05 23:04:13.815687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:46:00.0
totalMemory: 8.00GiB freeMemory: 5.71GiB
2018-10-05 23:04:13.815706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-10-05 23:04:14.187784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-05 23:04:14.187826: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0 
2018-10-05 23:04:14.187831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N 
2018-10-05 23:04:14.188000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5481 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:46:00.0, compute capability: 6.1)
2018-10-05 23:04:17.923348: step 0, loss = 4.68 (139.6 examples/sec; 0.917 sec/batch)
2018-10-05 23:04:18.492055: step 10, loss = 4.60 (2250.7 examples/sec; 0.057 sec/batch)
2018-10-05 23:04:18.914545: step 20, loss = 4.62 (3029.7 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:19.332597: step 30, loss = 4.60 (3061.8 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:19.756255: step 40, loss = 4.38 (3021.3 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:20.198415: step 50, loss = 4.42 (2894.9 examples/sec; 0.044 sec/batch)
2018-10-05 23:04:20.622042: step 60, loss = 4.40 (3021.5 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:21.040028: step 70, loss = 4.21 (3062.3 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:21.449060: step 80, loss = 4.19 (3129.3 examples/sec; 0.041 sec/batch)
2018-10-05 23:04:21.868983: step 90, loss = 4.12 (3048.2 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:22.424941: step 100, loss = 4.13 (2302.3 examples/sec; 0.056 sec/batch)
2018-10-05 23:04:22.865714: step 110, loss = 4.14 (2904.0 examples/sec; 0.044 sec/batch)
2018-10-05 23:04:23.308738: step 120, loss = 4.08 (2889.2 examples/sec; 0.044 sec/batch)
2018-10-05 23:04:23.761966: step 130, loss = 3.85 (2824.2 examples/sec; 0.045 sec/batch)
2018-10-05 23:04:24.201389: step 140, loss = 4.00 (2912.9 examples/sec; 0.044 sec/batch)
2018-10-05 23:04:24.621869: step 150, loss = 3.95 (3044.1 examples/sec; 0.042 sec/batch)
2018-10-05 23:04:25.062640: step 160, loss = 3.93 (2904.0 examples/sec; 0.044 sec/batch)
2018-10-05 23:04:25.516130: step 170, loss = 3.92 (2822.6 examples/sec; 0.045 sec/batch)
2018-10-05 23:04:25.967363: step 180, loss = 3.88 (2836.7 examples/sec; 0.045 sec/batch)
2018-10-05 23:04:26.433383: step 190, loss = 3.72 (2746.7 examples/sec; 0.047 sec/batch)
2018-10-05 23:04:27.021799: step 200, loss = 3.90 (2175.3 examples/sec; 0.059 sec/batch)

Extras

With Python 2.7.15, mock was required, so I installed it.
Everything else built the same way. (I needed the 2.7 build for NVIDIA's e-learning course.)

pip install mock
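
For completeness, the Python 2.7 environment follows the same pattern as the 3.6.6 one above; this is only a sketch and the virtualenv name is just an example.

# Same flow as above, for Python 2.7.15
pyenv install 2.7.15
pyenv virtualenv 2.7.15 tensorflow_python2715
pyenv activate tensorflow_python2715
pip install numpy six wheel mock    # mock was the extra requirement under 2.7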