Installing TensorFlow on the Jetson Nano After Setup and Verifying It Works

Posted at 2019-05-19


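What sets the Jetson Nano apart from the Raspberry Pi is its onboard GPU, so naturally I want to try machine learning on it. The bundled samples are too advanced as a starting point, though, so I will begin by building a simple AND gate.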


There is an official TensorFlow build for the Jetson Nano, so install that.

sudo apt-get install python3-pip libhdf5-serial-dev hdf5-tools
pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.13.1+nv19.4 --user


python3 -c 'import tensorflow; print(tensorflow.__version__)'

At the time of writing, this installed version 1.13.1.

Related packages installed alongside it include TensorBoard and Estimator. I have not used TensorRT yet, so I plan to look into it separately.

$ pip3 freeze | grep tensor



import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

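# Model: a single sigmoid unit taking two inputs (2 weights + 1 bias)
model = Sequential()
model.add(Dense(1, input_shape=(2, ), activation='sigmoid'))
model.compile(loss='mse', optimizer='adam', metrics=['acc'])
model.summary()  # prints the layer table that appears in the run log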

# Training
x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]) # N x 2
y_train = np.array([0, 0, 0, 1]).reshape(-1, 1) # N x 1
model.fit(x_train, y_train, epochs=3000, verbose=True)

# Evaluation
x_test = x_train
y_test = y_train
score = model.evaluate(x_test, y_test, verbose=False)
print('Test score:', score[0])
print('Test accuracy:', score[1])
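
To make it clearer what this one-unit model can represent, here is a minimal sketch in plain Python of a single sigmoid neuron computing AND. The weights `w = (10, 10)` and the bias `b = -15` are hand-picked for illustration; they are assumptions, not the values that training actually produces.

```python
import math

def sigmoid(z):
    """Logistic function used as the activation."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """One sigmoid unit applied to one sample: sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hand-picked (hypothetical) parameters: 2 weights + 1 bias = 3 parameters
w = (10.0, 10.0)
b = -15.0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [neuron(x, w, b) for x in inputs]
# Threshold at 0.5 to turn the sigmoid output into a 0/1 decision
predictions = [1 if y >= 0.5 else 0 for y in outputs]

for x, y, p in zip(inputs, outputs, predictions):
    print(x, round(y, 4), p)
```

Only the input (1, 1) pushes the weighted sum above zero, so only that case is classified as 1. Any weights with the same sign pattern work; training simply searches for such values.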



$ python3 and.py 
WARNING:tensorflow:From /home/yamamo-to/.local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/yamamo-to/.local/lib/python3.6/site-packages/tensorflow/python/keras/utils/losses_utils.py:170: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 1)                 3         
Total params: 3
Trainable params: 3
Non-trainable params: 0
WARNING:tensorflow:From /home/yamamo-to/.local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-05-20 01:36:57.560684: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-05-20 01:36:57.561667: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x15dbc480 executing computations on platform Host. Devices:
2019-05-20 01:36:57.561737: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): <undefined>, <undefined>
2019-05-20 01:36:57.627701: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-05-20 01:36:57.627993: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x15e9b700 executing computations on platform CUDA. Devices:
2019-05-20 01:36:57.628052: I tensorflow/compiler/xla/service/service.cc:168]   StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2019-05-20 01:36:57.628401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
totalMemory: 3.87GiB freeMemory: 598.47MiB
2019-05-20 01:36:57.628476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-05-20 01:36:58.625436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-20 01:36:58.625521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-05-20 01:36:58.625559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-05-20 01:36:58.625749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 130 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
Epoch 1/3000
2019-05-20 01:36:59.260357: I tensorflow/stream_executor/dso_loader.cc:153] successfully opened CUDA library libcublas.so.10.0 locally
4/4 [==============================] - 1s 204ms/sample - loss: 0.2856 - acc: 0.5000
Epoch 2/3000
4/4 [==============================] - 0s 2ms/sample - loss: 0.2853 - acc: 0.5000
Epoch 2999/3000
4/4 [==============================] - 0s 2ms/sample - loss: 0.0840 - acc: 1.0000
Epoch 3000/3000
4/4 [==============================] - 0s 2ms/sample - loss: 0.0840 - acc: 1.0000
Test score: 0.08399419486522675
Test accuracy: 1.0
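
The final loss of about 0.084 alongside an accuracy of 1.0 can look contradictory. The sketch below shows how both can hold at once, assuming accuracy is computed by thresholding each output at 0.5. The four output values are hypothetical, chosen only to illustrate the effect; they are not the trained model's actual outputs.

```python
# Targets of the AND gate for (0,0), (0,1), (1,0), (1,1)
y_true = [0, 0, 0, 1]
# Hypothetical sigmoid outputs: on the correct side of 0.5, but not close to 0 or 1
y_pred = [0.1, 0.3, 0.3, 0.6]

# Mean squared error penalizes the remaining distance to the targets
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Accuracy only checks which side of the 0.5 threshold each output falls on
hits = [(1 if p >= 0.5 else 0) == t for t, p in zip(y_true, y_pred)]
accuracy = sum(hits) / len(hits)

print('mse:', round(mse, 4))
print('accuracy:', accuracy)
```

In other words, the model already classifies all four cases correctly while its outputs are still some distance from exact 0 and 1, which is what the remaining MSE measures.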


During the run, power draw rose from 3.8 W before execution to 5.8-6.0 W, an increase of about 2 W. The execution time, measured with the time command, was as follows.

$ time python3 and.py
real    0m47.031s
user    1m2.516s
sys     0m9.184s
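
As a rough worked calculation from the figures above, the following sketch derives the average wall-clock time per epoch, the ratio of CPU time to wall-clock time, and an estimate of the extra energy used. It assumes the roughly 2 W increase held for the entire run and that startup time is spread evenly over the 3000 epochs, so the results are approximations only.

```python
def to_seconds(t):
    """Convert a `time` field such as '1m2.516s' to seconds."""
    minutes, seconds = t.rstrip('s').split('m')
    return int(minutes) * 60 + float(seconds)

real = to_seconds('0m47.031s')
user = to_seconds('1m2.516s')
sys_time = to_seconds('0m9.184s')

epochs = 3000
extra_power_w = 2.0  # approximate increase over idle (assumed constant)

per_epoch_ms = real / epochs * 1000      # includes startup overhead
cpu_ratio = (user + sys_time) / real     # > 1 means more than one core was busy
extra_energy_j = extra_power_w * real    # joules = watts * seconds

print('per epoch: %.1f ms' % per_epoch_ms)
print('CPU time / wall time: %.2f' % cpu_ratio)
print('extra energy: %.0f J' % extra_energy_j)
```

The CPU-to-wall ratio above 1 suggests that more than one CPU core was active on average, and the per-epoch figure is an upper bound on training cost because it also absorbs TensorFlow's startup and GPU initialization.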

