GeForce GTX 1070 (8GB)
ASRock Z170M Pro4S [Intel Z170 chipset]
Ubuntu 14.04 LTS desktop amd64
TensorFlow v0.11
cuDNN v5.1 for Linux
CUDA v8.0
Python 2.7.6
IPython 5.1.0 -- An enhanced Interactive Python.
gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
GNU bash, version 4.3.8(1)-release (x86_64-pc-linux-gnu)
Overview

Section 5.4, "(Simple) example of function approximation", of
http://users.monash.edu/~app/Lrn/LearningMDS.pdf
maps an input in R^2 (two-dimensional reals) to an output in R^2.
I trained this mapping with TensorFlow and checked the learning result.
Related
v0.1: http://qiita.com/7of9/items/16bf19555dd97b09ecb0
Function to be learned

The network is trained on the (x_1, x_2), (y_1, y_2) pairs of
http://qiita.com/7of9/items/54ec092a91880df9dc64
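For reference, here is the mapping as implemented in func2d() in the data-generation code below (a reconstruction from that code, not copied from the linked article):

$$
\rho = x_1^2 + x_2^2, \qquad
y_1 = x_1 \, e^{-\rho^2}, \qquad
y_2 = \frac{\sin(2\rho^2)}{4\rho^2}
$$

with (y_1, y_2) = (0, 0) when \rho is numerically zero.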
Halton Sequence code v0.3

Code that generates the Halton Sequence used in QMC (Quasi-Monte Carlo).
It is used to obtain evenly distributed points in two dimensions (R^2).
'''
v0.3 Mar. 20, 2017
  - change to a static function in a Class
v0.2 Oct. 22, 2016
  - implemented in Python
v0.1 Mar., 2005 or so
  - implemented in C
'''

# codingrule:PEP8


class CHaltonSequence:
    @staticmethod
    def calc_Halton_sequence(index):
        # radical inverse of [index] in base 2 (x) and base 3 (y)
        XBASE = 2
        YBASE = 3
        inv_xbase = 1.0 / XBASE
        fac_x = 1.0 / XBASE
        inv_ybase = 1.0 / YBASE
        fac_y = 1.0 / YBASE

        inp = index
        xwrk = 0.0
        while inp > 0:
            xwrk = xwrk + (inp % XBASE) * inv_xbase
            inp = inp // XBASE  # integer division (also correct on Python 3)
            inv_xbase = inv_xbase * fac_x

        inp = index
        ywrk = 0.0
        while inp > 0:
            ywrk = ywrk + (inp % YBASE) * inv_ybase
            inp = inp // YBASE  # integer division (also correct on Python 3)
            inv_ybase = inv_ybase * fac_y
        return xwrk, ywrk

'''
Usage example:

from UtilHaltonSequence import CHaltonSequence

for idx in range(0, 11):
    x0, y0 = CHaltonSequence.calc_Halton_sequence(index=idx)
    print(x0, y0)
'''
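A quick way to sanity-check the output: calc_Halton_sequence() computes the radical inverse of index, i.e., it mirrors the base-b digits of index around the radix point. For example, for index = 5 in base 2, 5 = 101_2, so the mirrored value is 0.101_2 = 1/2 + 1/8 = 0.625, which matches the first column of the idx = 5 row in the CSV output shown below.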
Data generation code v0.1

The Halton Sequence is used to make the input points in R^2 more evenly distributed.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import math
import sys
from UtilHaltonSequence import CHaltonSequence

'''
on Python 2.7.6

v0.1 Mar. 21, 2017
  - output to STD
  - add func2d()
  - add get_2d_QMC()
  - add get_2d_random_nums()
'''

# codingrule:PEP8


def get_2d_random_nums(xsize, ysize):
    # pseudo-random 2D points in [0, 1)
    MAXVAL_PLUS_ONE = 65536
    ints = np.random.randint(MAXVAL_PLUS_ONE, size=(xsize, ysize))
    # print(ints)
    flts = ints / float(MAXVAL_PLUS_ONE)
    # print(flts)
    return flts


def get_2d_QMC(xsize):
    # quasi-random 2D points from the Halton Sequence
    alist = []
    for idx in range(0, xsize):
        xwrk, ywrk = CHaltonSequence.calc_Halton_sequence(index=idx)
        alist.append([xwrk, ywrk])
    # print(alist)
    return np.array(alist)


def func2d(x1, x2):
    rho = x1 ** 2 + x2 ** 2
    epsilon = 10.**(-10)
    if abs(rho) < epsilon:
        return (0.0, 0.0)
    y1 = x1 * math.exp(-rho ** 2)
    try:
        # print(rho)  # debug
        y2 = math.sin(2.0 * rho ** 2) / 4.0 / rho ** 2
    except ZeroDivisionError:
        y2 = 0.0
    return (y1, y2)


NUM_SAMPLES = 100

# 1. get random numbers in 2D
#
# a. np.random.randint()
# arr_2d = get_2d_random_nums(NUM_SAMPLES, 2)  # [0,1)
#
# b. QMC using Halton Sequence
arr_2d = get_2d_QMC(NUM_SAMPLES)  # [0, 1)

for x1, x2 in arr_2d:
    y1, y2 = func2d(x1, x2)
    print('%.7f, %.7f, %.7f, %.7f' % (x1, x2, y1, y2))
    # sys.exit()  # debug
$ python prep_data_170321.py > input.csv
$ head input.csv
0.0000000, 0.0000000, 0.0000000, 0.0000000
0.5000000, 0.3333333, 0.4388716, 0.4943511
0.2500000, 0.6666667, 0.1933435, 0.4782739
0.7500000, 0.1111111, 0.5389515, 0.4643882
0.1250000, 0.4444444, 0.1194477, 0.4993122
0.6250000, 0.7777778, 0.2319694, 0.2311777
0.3750000, 0.2222222, 0.3617029, 0.4995656
0.8750000, 0.5555556, 0.2759375, 0.1603667
0.0625000, 0.8888889, 0.0332709, 0.3776411
0.5625000, 0.0370370, 0.5084711, 0.4966077
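As a quick check of the second row: \rho = 0.5^2 + (1/3)^2 ≈ 0.3611, so y_1 = 0.5 · exp(-0.3611^2) ≈ 0.43887 and y_2 = sin(2 · 0.3611^2) / (4 · 0.3611^2) ≈ 0.49435, matching the printed values.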
Training code v0.2

To match the Python script (shown later) that plots the learned (y_1, y_2), capacity was changed from 40 to 100.
The training result (weights and biases) is written to
model_variables_170321.npy
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import tensorflow as tf
import tensorflow.contrib.slim as slim
import numpy as np

'''
v0.2 Mar. 24, 2017
  - change [capacity] from 40 to 100
  - output [model_variables] after training
v0.1 Mar. 22, 2017
  - learn mapping of R^2 input to R^2 output
    + using data prepared by [prep_data_170321.py]
  - branched from sine curve learning at
    http://qiita.com/7of9/items/ce58e66b040a0795b2ae
'''

# codingrule:PEP8

filename_queue = tf.train.string_input_producer(["input.csv"])

# parse CSV
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
input1, input2, output1, output2 = tf.decode_csv(
    value, record_defaults=[[0.], [0.], [0.], [0.]])
inputs = tf.pack([input1, input2])
output = tf.pack([output1, output2])

batch_size = 4  # [4]
inputs_batch, output_batch = tf.train.shuffle_batch(
    [inputs, output], batch_size, capacity=100, min_after_dequeue=batch_size)

input_ph = tf.placeholder("float", [None, 2])
output_ph = tf.placeholder("float", [None, 2])

# network: three hidden layers of 7 sigmoid nodes, linear output layer
hiddens = slim.stack(input_ph, slim.fully_connected, [7, 7, 7],
                     activation_fn=tf.nn.sigmoid, scope="hidden")
# prediction = slim.fully_connected(
#     hiddens, 2, activation_fn=tf.nn.sigmoid, scope="output")
prediction = slim.fully_connected(
    hiddens, 2, activation_fn=None, scope="output")

loss = tf.contrib.losses.mean_squared_error(prediction, output_ph)
train_op = slim.learning.create_train_op(loss, tf.train.AdamOptimizer(0.001))

init_op = tf.initialize_all_variables()

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        sess.run(init_op)
        for i in range(10000):  # [10000]
            inpbt, outbt = sess.run([inputs_batch, output_batch])
            _, t_loss = sess.run([train_op, loss],
                                 feed_dict={input_ph: inpbt, output_ph: outbt})

            if (i+1) % 100 == 0:
                print("%d,%f" % (i+1, t_loss))
    finally:
        coord.request_stop()

    # output the model
    model_variables = slim.get_model_variables()
    res = sess.run(model_variables)
    np.save('model_variables_170321.npy', res)

    coord.join(threads)
$ python learn_xxyyfunc_170321.py > res.2in2out_170323
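Note: this script targets TensorFlow v0.11. Several of these APIs were renamed in TensorFlow 1.x; a sketch of the equivalents (my mapping, not part of the original script):

inputs = tf.stack([input1, input2])    # tf.pack() was renamed to tf.stack()
output = tf.stack([output1, output2])
loss = tf.losses.mean_squared_error(output_ph, prediction)  # labels come first
init_op = tf.global_variables_initializer()  # replaces tf.initialize_all_variables()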
Checking the loss curve

Jupyter code:
%matplotlib inline

# mapping of R^2 to R^2
# Last update: Mar. 30, 2017

import numpy as np
import matplotlib.pyplot as plt

# data1 = np.loadtxt('res.2in2out_170322', delimiter=',')  # capacity = 40
data1 = np.loadtxt('res.2in2out_170323', delimiter=',')  # capacity = 100

input1 = data1[:, 0]
output1 = data1[:, 1]

fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(input1, output1)
ax1.set_xlabel('idx')
ax1.set_ylabel('loss')
ax1.set_ylim([0, 0.02])
ax1.grid(True)
fig.show()
Perhaps because of the capacity change, the loss decreases more poorly than it did with v0.1.
Plotting the learning result

Feed (x_1, x_2) to the trained network and compute (y_1, y_2).

Jupyter code:
'''
v0.1 Mar. 29, 2017
  - move to matplotlib on Jupyter
  - modify calc_conv() for 2 input nodes
=== [reproduce_sine.py] branched to [reproduce_xxxfunc_170328.py] ===
v0.3 Dec. 11, 2016
  - add output_debugPrint()
  - fix bug > calc_sigmoid() was using positive for exp()
v0.2 Dec. 10, 2016
  - calc_conv() takes [applyActFnc] argument
v0.1 Dec. 10, 2016
  - add calc_sigmoid()
  - add fully_connected network
  - add input data for sine curve
=== [read_model_var.py] branched to [reproduce_sine.py] ===
v0.4 Dec. 10, 2016
  - add 2x2 network example
v0.3 Dec. 07, 2016
  - calc_conv() > add bias
v0.2 Dec. 07, 2016
  - fix calc_conv() treating src as a list
v0.1 Dec. 07, 2016
  - add calc_conv()
'''

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import math
import sys
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from UtilHaltonSequence import CHaltonSequence


# to ON/OFF debug print at one place
def output_debugPrint(msg):
    # print(msg)
    pass  # no operation


def calc_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def calc_conv(src, weight, bias, applyActFnc):
    # one fully-connected layer: dst = actFnc(src x weight + bias)
    wgt = weight.shape
    # print(wgt)  # debug
    # conv = list(range(bias.size))
    conv = [0.0] * bias.size
    # weight
    for idx1 in range(wgt[0]):
        for idx2 in range(wgt[1]):
            conv[idx2] = conv[idx2] + src[idx1] * weight[idx1, idx2]
    # bias
    for idx2 in range(wgt[1]):
        conv[idx2] = conv[idx2] + bias[idx2]
    # activation function
    if applyActFnc:
        for idx2 in range(wgt[1]):
            conv[idx2] = calc_sigmoid(conv[idx2])
    return conv  # return list


def get_2d_QMC(xsize):
    alist = []
    for idx in range(0, xsize):
        xwrk, ywrk = CHaltonSequence.calc_Halton_sequence(index=idx)
        alist.append([xwrk, ywrk])
    # print(alist)
    return np.array(alist)


def func2d(x1, x2):
    rho = x1 ** 2 + x2 ** 2
    epsilon = 10.**(-10)
    if abs(rho) < epsilon:
        return (0.0, 0.0)
    y1 = x1 * math.exp(-rho ** 2)
    try:
        # print(rho)  # debug
        y2 = math.sin(2.0 * rho ** 2) / 4.0 / rho ** 2
    except ZeroDivisionError:
        y2 = 0.0
    return (y1, y2)


# alternating [weight, bias] arrays for the 4 layers:
# shapes (2,7),(7,), (7,7),(7,), (7,7),(7,), (7,2),(2,)
model_var = np.load('model_variables_170321.npy')
# output_debugPrint(("all shape:", (model_var.shape)))

NUM_SAMPLES = 100
inpdata = get_2d_QMC(NUM_SAMPLES)  # [0, 1) (not used below; a regular grid is used instead)

prd_y1_1d = np.array([])
prd_y2_1d = np.array([])
ans_y1_1d = np.array([])
ans_y2_1d = np.array([])

for x1idx in range(0, 20, 2):
    for x2idx in range(0, 20, 2):
        inlist = (x1idx / 20.0, x2idx / 20.0)

        # input layer (2 nodes)
        #
        # hidden layer 1 (7 nodes)
        outdata = calc_conv(inlist, model_var[0], model_var[1], applyActFnc=True)
        # hidden layer 2 (7 nodes)
        outdata = calc_conv(outdata, model_var[2], model_var[3], applyActFnc=True)
        # hidden layer 3 (7 nodes)
        outdata = calc_conv(outdata, model_var[4], model_var[5], applyActFnc=True)
        # output layer (2 nodes)
        outdata = calc_conv(outdata, model_var[6], model_var[7], applyActFnc=False)

        prd_y1_1d = np.append(prd_y1_1d, outdata[0])
        prd_y2_1d = np.append(prd_y2_1d, outdata[1])

        answer = func2d(inlist[0], inlist[1])
        ans_y1_1d = np.append(ans_y1_1d, answer[0])
        ans_y2_1d = np.append(ans_y2_1d, answer[1])
        # print('%s %s %s' % (inlist, outdata, answer))

siz = 10  # 10x10 grid

plt.subplot(221)
dat_2d = np.reshape(prd_y1_1d, (siz, siz))
plt.imshow(dat_2d, extent=(0, siz, 0, siz), cmap=cm.gist_rainbow)

plt.subplot(222)
dat_2d = np.reshape(ans_y1_1d, (siz, siz))
plt.imshow(dat_2d, extent=(0, siz, 0, siz), cmap=cm.gist_rainbow)

plt.subplot(223)
dat_2d = np.reshape(prd_y2_1d, (siz, siz))
plt.imshow(dat_2d, extent=(0, siz, 0, siz), cmap=cm.gist_rainbow)

plt.subplot(224)
dat_2d = np.reshape(ans_y2_1d, (siz, siz))
plt.imshow(dat_2d, extent=(0, siz, 0, siz), cmap=cm.gist_rainbow)

plt.show()
The two panels on the left show (y_1, y_2) obtained by feeding (x_1, x_2) to the trained network.
The two panels on the right show the training data (the correct answer).
As already seen from the loss curve, the learning accuracy is still poor.
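As an aside, the per-element loops in calc_conv() can be collapsed into matrix operations. A minimal NumPy sketch of the same forward pass (my rewrite, assuming the model_variables_170321.npy layout described above):

import numpy as np


def forward(x, model_var):
    # three sigmoid hidden layers, linear output layer;
    # model_var holds alternating [weight, bias] arrays
    a = np.asarray(x, dtype=float)
    for i in (0, 2, 4):
        a = 1.0 / (1.0 + np.exp(-(np.dot(a, model_var[i]) + model_var[i + 1])))
    return np.dot(a, model_var[6]) + model_var[7]

# e.g.:
# model_var = np.load('model_variables_170321.npy')
# print(forward((0.5, 0.25), model_var))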
Remarks

Although the Halton Sequence is used, the inputs are drawn in batches during training. Because of this, the convergence benefit of the Halton Sequence may not be realized.

Since the Halton Sequence values are used as-is, the input range was [0,1). Normally one would apply a linear transformation to use a range such as [-2,2), as sketched below. A TODO for later.
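A minimal sketch of that rescaling in prep_data_170321.py (assuming the target range [-2, 2)):

arr_2d = get_2d_QMC(NUM_SAMPLES)  # Halton values in [0, 1)
arr_2d = 4.0 * arr_2d - 2.0       # linearly mapped to [-2, 2)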
Trying a different capacity

In the following part of learn_xxyyfunc_170321.py, capacity was set back to 40:

batch_size = 4  # [4]
inputs_batch, output_batch = tf.train.shuffle_batch(
    [inputs, output], batch_size, capacity=40, min_after_dequeue=batch_size)

I do not yet understand what capacity does.
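As far as I understand from the tf.train.shuffle_batch() documentation: capacity is the maximum number of elements the internal queue may hold, and min_after_dequeue is the minimum number of elements left in the queue after a dequeue, which determines how well the examples are mixed. Read that way, the call above means:

inputs_batch, output_batch = tf.train.shuffle_batch(
    [inputs, output], batch_size,
    capacity=40,                    # the queue holds at most 40 examples
    min_after_dequeue=batch_size)   # only 4 must remain, so shuffling is weak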