Operating environment
GeForce GTX 1070 (8GB)
ASRock Z170M Pro4S [Intel Z170 chipset]
Ubuntu 16.04 LTS desktop amd64
TensorFlow v1.1.0
cuDNN v5.1 for Linux
CUDA v8.0
Python 3.5.2
IPython 6.0.0 -- An enhanced Interactive Python.
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
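Overview
This post compares the processing time of two implementations of a calculation that uses the weights and biases of a network trained with TensorFlow.
Related: http://qiita.com/7of9/items/f267e59790526e49a7fd
Implementations compared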
# weight (wgt = weight.shape)
for idx1 in range(wgt[0]):
    for idx2 in range(wgt[1]):
        conv[idx2] = conv[idx2] + src[idx1] * weight[idx1, idx2]
and
# weight (wgt = weight.shape)
for idx2 in range(wgt[1]):
    tmp_vec = weight[:,idx2] * src[:]
    conv[idx2] = tmp_vec.sum()
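Both loops compute the same quantity, conv[j] = sum over i of src[i] * weight[i, j]. The second variant only replaces the inner Python loop with a NumPy elementwise product and sum. The following is a minimal sketch, not part of the original measurements, that checks this on small made-up arrays and also shows that a single matrix product gives the same values. The array values and the names conv_loop, conv_vec and conv_dot are hypothetical and chosen only for illustration.

```python
import numpy as np

# Small made-up inputs (hypothetical values, easy to check by hand)
src = np.array([1.0, 2.0, 3.0])      # 3 input nodes
weight = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [2.0, 2.0]])      # shape (3, 2): 3 inputs, 2 outputs

# Variant 1: double Python loop over inputs and outputs
conv_loop = [0.0] * weight.shape[1]
for idx1 in range(weight.shape[0]):
    for idx2 in range(weight.shape[1]):
        conv_loop[idx2] += src[idx1] * weight[idx1, idx2]

# Variant 2: one elementwise product and sum per output column
conv_vec = [(weight[:, idx2] * src).sum() for idx2 in range(weight.shape[1])]

# For comparison: the same weighted sums as a single matrix product
conv_dot = src @ weight

print(conv_loop, conv_vec, conv_dot)
```

With these inputs each variant yields 7 for the first output and 8 for the second (1*1 + 2*0 + 3*2 and 1*0 + 2*1 + 3*2).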
Full code
Jupyter notebook code:
profile_calc_conv_170722.ipynb
%%timeit
# Note: %%timeit times this whole cell, including the imports,
# the function definitions and the random input generation.
import numpy as np
import math
import sys

# Note: calc_sigmoid is not defined in this cell; it is never called
# here because applyActFnc=False in the calls below.

def calc_conv1(src, weight, bias, applyActFnc):
    # Variant 1: double Python loop over inputs (idx1) and outputs (idx2)
    wgt = weight.shape
    conv = [0.0] * bias.size
    # weight
    for idx1 in range(wgt[0]):
        for idx2 in range(wgt[1]):
            conv[idx2] = conv[idx2] + src[idx1] * weight[idx1, idx2]
    # bias
    for idx2 in range(wgt[1]):
        conv[idx2] = conv[idx2] + bias[idx2]
    # activation function
    if applyActFnc:
        for idx2 in range(wgt[1]):
            conv[idx2] = calc_sigmoid(conv[idx2])
    return conv  # return list

def calc_conv3(src, weight, bias, applyActFnc):
    # Variant 3: one NumPy elementwise product and sum per output (idx2)
    wgt = weight.shape
    conv = [0.0] * bias.size
    # weight
    for idx2 in range(wgt[1]):
        tmp_vec = weight[:,idx2] * src[:]
        conv[idx2] = tmp_vec.sum()
    # bias
    for idx2 in range(wgt[1]):
        conv[idx2] = conv[idx2] + bias[idx2]
    # activation function
    if applyActFnc:
        for idx2 in range(wgt[1]):
            conv[idx2] = calc_sigmoid(conv[idx2])
    return conv  # return list

INP_NODE = 100
weight = np.random.randn(INP_NODE,2)
src = np.random.randn(INP_NODE)
# Note: bias has 100 entries while weight has only 2 output columns,
# so conv has 100 entries and only conv[0] and conv[1] are updated.
bias = np.random.randn(100)
res1 = calc_conv1(src, weight, bias, applyActFnc=False)
#res3 = calc_conv3(src, weight, bias, applyActFnc=False)

#for elem in zip(res1, res3):
#    #if (elem[0] > 0.0):
#    print(elem)
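Results
- calc_conv1
  - 84.5 µs ± 120 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
- calc_conv3
  - 24.1 µs ± 502 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
calc_conv3 is about 3.5 times faster (84.5 µs vs. 24.1 µs).
The two results should be the same, though the comparison code in the cell is commented out.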
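The agreement of the two variants can be checked numerically rather than assumed, and the function calls alone can be timed with the standard timeit module (the %%timeit cell magic in the full code times the entire cell, including definitions and input generation). The following is a minimal sketch under stated assumptions: conv_loop and conv_vec are hypothetical compact stand-ins for calc_conv1 and calc_conv3 without the activation step, and bias is assumed to have one entry per output column (the original cell uses 100 entries). Timings will differ by machine.

```python
import timeit
import numpy as np

def conv_loop(src, weight, bias):
    """Double Python loop, in the style of calc_conv1 (activation omitted)."""
    n_in, n_out = weight.shape
    conv = [0.0] * n_out
    for idx1 in range(n_in):
        for idx2 in range(n_out):
            conv[idx2] += src[idx1] * weight[idx1, idx2]
    return [conv[idx2] + bias[idx2] for idx2 in range(n_out)]

def conv_vec(src, weight, bias):
    """One NumPy product-and-sum per output, in the style of calc_conv3."""
    n_out = weight.shape[1]
    return [(weight[:, idx2] * src).sum() + bias[idx2] for idx2 in range(n_out)]

INP_NODE = 100
weight = np.random.randn(INP_NODE, 2)
src = np.random.randn(INP_NODE)
bias = np.random.randn(2)  # assumption: one bias per output column

res1 = conv_loop(src, weight, bias)
res3 = conv_vec(src, weight, bias)

# Summing in a different order can change the last few bits of a float,
# so compare with a tolerance instead of exact equality.
same = np.allclose(res1, res3)
print("results agree:", same)

# Time only the function calls, averaged over n calls each.
n = 1000
t_loop = timeit.timeit(lambda: conv_loop(src, weight, bias), number=n) / n
t_vec = timeit.timeit(lambda: conv_vec(src, weight, bias), number=n) / n
print("loop: %.1f us per call, vectorized: %.1f us per call"
      % (t_loop * 1e6, t_vec * 1e6))
```

Comparing with np.allclose is a deliberate choice here: the two variants accumulate the products in different orders, so bitwise-identical results are not guaranteed even when both are correct.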