@hiroyuki827

# 簡単な分類問題を解いてみませんか？

More than 3 years have passed since last update.

# Introduction

"ゼロから作るDeep Learning" (Deep Learning from Scratch) takes the stance of implementing in NumPy what TensorFlow would otherwise do for you. Following that approach, this article poses a simple classification problem to be solved with NumPy alone.

- ゼロから作るDeep Learning

# Problem

Suppose we want to classify some data (4 samples) into 3 distinct classes: 0, 1, and 2.
We have set up a network with a pre-activation output z in the last layer.
Applying softmax will give the final model output.
input X ---> some network --> z --> y_model = softmax(z)

We quantify the agreement between truth (y) and model using categorical cross-entropy:
J = - sum_i (y_i * log(y_model(x_i)))

In the following you are to implement softmax and categorical cross-entropy
and evaluate them given the values for z.

The TensorFlow implementation is already given below, so write a program that behaves identically using only NumPy. (Section 3.5 of the reference book should be helpful.) There are five problems, 1) through 5).


```python
from __future__ import print_function
import numpy as np
import tensorflow as tf

# Data: 4 samples with the following class labels (input features X irrelevant here)
y_cl = np.array([0, 0, 2, 1])

# output of the last network layer before applying softmax
z = np.array([
    [  4,   5,   1],
    [ -1,  -2,  -3],
    [0.1, 0.2, 0.3],
    [ -1, 100,   1]
])

# TensorFlow implementation as reference. Make sure you get the same results!
print('\nTensorFlow ------------------------------ ')
with tf.Session() as sess:
    z_ = tf.constant(z, dtype='float64')
    y_ = tf.placeholder(dtype='float64', shape=(None, 3))

    y = np.array([[1., 0., 0.], [1., 0., 0.], [0., 0., 1.], [0., 1., 0.]])
    print('one-hot encoding of data labels')
    print(y)

    y_model = tf.nn.softmax(z_)
    crossentropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_model), reduction_indices=[1]))

    print('softmax(z)')
    print(sess.run(y_model))

    print('cross entropy = %f' % sess.run(crossentropy, feed_dict={y_: y}))

print('\nMy solution ------------------------------ ')
# 1) Write a function that turns any class labels y_cl into one-hot encodings y. (2 points)
#    0 --> (1, 0, 0)
#    1 --> (0, 1, 0)
#    2 --> (0, 0, 1)
#    Make sure that np.shape(y) = (4, 3) for np.shape(y_cl) = (4,).
def to_onehot(y_cl, num_classes):
    y = np.zeros((len(y_cl), num_classes))
    return y

# 2) Write a function that returns the softmax of the input z along the last axis. (2 points)
def softmax(z):
    return None

# 3) Compute the categorical cross-entropy between data and model (2 points)

# 4) Which classes are predicted by the model (maximum entry). (1 point)

# 5) How many samples are correctly classified (accuracy)? (1 point)
```



# Solution

```python
from __future__ import print_function
import numpy as np

# Data: 4 samples with the following class labels (input features X irrelevant here)
y_cl = np.array([0, 0, 2, 1])

# output of the last network layer before applying softmax
z = np.array([
    [  4,   5,   1],
    [ -1,  -2,  -3],
    [0.1, 0.2, 0.3],
    [ -1, 100,   1]
])

# TensorFlow part omitted

print('\n☆My solution ------------------------------ ')
# 1) Write a function that turns any class labels y_cl into one-hot encodings y. (2 points)
#    0 --> (1, 0, 0)
#    1 --> (0, 1, 0)
#    2 --> (0, 0, 1)
#    Make sure that np.shape(y) = (4, 3) for np.shape(y_cl) = (4,).
def to_onehot(num_classes, y_cl):
    y_one = np.eye(num_classes)[y_cl]
    return y_one

print('one-hot encoding of data labels by Numpy')
y_one = (to_onehot(3, y_cl)).astype(np.float32)
print(y_one)

# 2) Write a function that returns the softmax of the input z along the last axis. (2 points)
def softmax(z):
    e = np.exp(z)
    dist = e / np.sum(e, axis=1, keepdims=True)
    return dist

print('softmax(z) by Numpy')
y_my = softmax(z)
print(y_my)

# 3) Compute the categorical cross-entropy between data and model (2 points)
crossentropy_my = np.mean(-np.sum(y_one * np.log(y_my), axis=1))
print('cross entropy by Numpy: %f' % crossentropy_my)

# 4) Which classes are predicted by the model (maximum entry). (1 point)
print('The predicted class by Numpy:')
y_pre_cl = np.argmax(y_my, axis=1)
print(y_pre_cl)

# 5) How many samples are correctly classified (accuracy)? (1 point)
accuracy_my = np.mean(y_pre_cl == y_cl)
print('accuracy by Numpy: %f' % accuracy_my)
```


Running the full script produces:

```
■Input data with 4 samples: [0 0 2 1]

☆TensorFlow ------------------------------
one-hot encoding of data labels
[[ 1.  0.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]
 [ 0.  1.  0.]]
softmax(z)
[[  2.65387929e-01   7.21399184e-01   1.32128870e-02]
 [  6.65240956e-01   2.44728471e-01   9.00305732e-02]
 [  3.00609605e-01   3.32224994e-01   3.67165401e-01]
 [  1.36853947e-44   1.00000000e+00   1.01122149e-43]]
cross entropy: 0.684028
The predicted class:
[1 0 2 1]
accuracy: 0.750000

☆My solution ------------------------------
one-hot encoding of data labels by Numpy
[[ 1.  0.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]
 [ 0.  1.  0.]]
softmax(z) by Numpy
[[  2.65387929e-01   7.21399184e-01   1.32128870e-02]
 [  6.65240956e-01   2.44728471e-01   9.00305732e-02]
 [  3.00609605e-01   3.32224994e-01   3.67165401e-01]
 [  1.36853947e-44   1.00000000e+00   1.01122149e-43]]
cross entropy by Numpy: 0.684028
The predicted class by Numpy:
[1 0 2 1]
accuracy by Numpy: 0.750000
```


# Explanation

## Problem 1

1) Write a function that turns any class labels y_cl into one-hot encodings y. (2 points)
0 --> (1, 0, 0)
1 --> (0, 1, 0)
2 --> (0, 0, 1)
Make sure that np.shape(y) = (4, 3) for np.shape(y_cl) = (4,).

```python
def to_onehot(num_classes, y_cl):
    y_one = np.eye(num_classes)[y_cl]
    return y_one
```


Indexing `np.eye(num_classes)` with the NumPy array `y_cl` yields an array in which a 1 sits at the position given by each element of `y_cl`. `num_classes` must equal max(`y_cl`) - min(`y_cl`) + 1 (here the labels are 0, 1, 2, so 3). This is the one-hot encoding.
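The row-indexing trick is easy to verify in isolation; each label simply selects the matching row of the identity matrix:

```python
import numpy as np

labels = np.array([0, 0, 2, 1])
# np.eye(3) has the one-hot vector for class i as its i-th row,
# so fancy indexing with the label array stacks the right rows
onehot = np.eye(3)[labels]
print(onehot.shape)  # (4, 3)
print(onehot)
```

This avoids any explicit loop over the samples.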

## Problem 2

2) Write a function that returns the softmax of the input z along the last axis. (2 points)

```python
def softmax(z):
    e = np.exp(z)
    dist = e / np.sum(e, axis=1, keepdims=True)
    return dist
```

Each row is exponentiated and then normalized by its own sum, so every row of the output is a probability distribution over the classes.
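One caveat worth noting (my addition, not in the original): `np.exp` overflows in float64 once entries exceed roughly 709. The z = 100 in the last sample still fits, but the standard safeguard is to subtract the row-wise maximum first, which leaves the softmax value unchanged:

```python
import numpy as np

def softmax_stable(z):
    # softmax(z) == softmax(z - c) for any per-row constant c,
    # so subtracting the row max keeps np.exp in a safe range
    e = np.exp(z - np.max(z, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

z_big = np.array([[1000.0, 1001.0, 1002.0]])  # naive np.exp would overflow here
print(softmax_stable(z_big))  # same values as the softmax of [0, 1, 2]
```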


## Problem 3

3) Compute the categorical cross-entropy between data and model (2 points)

```python
crossentropy_my = np.mean(-np.sum(y_one * np.log(y_my), axis=1))
print('cross entropy by Numpy: %f' % crossentropy_my)
```


We simply implement the definition of categorical cross-entropy,

$$E = - \sum_{k}t_{k} \log{y_k}$$

where $t_k$ is the one-hot vector and $y_k$ is the model output.

Note that `tf.reduce_mean` and `np.mean` do the same thing; for the meaning of NumPy's `axis` argument, refer to the NumPy documentation. Since this is a classification problem, the inner arrays (for `[[A,B,C],[D,E,F]]`, the arrays `[A,B,C]` and `[D,E,F]`) must be treated separately; specifying `axis=1` restricts the operation to within each inner array.
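The effect of `axis` can be seen on a tiny example: `axis=1` collapses each inner array, and the subsequent mean then runs over the samples:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
row_sums = np.sum(a, axis=1)   # sum inside each inner array
print(row_sums)                # [ 6 15]
print(np.mean(row_sums))       # 10.5
```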

To recap what we have so far:
- `y_cl` (the input labels) was converted into one-hot vectors -> `y_one`
- the array `z` coming out of the network was passed through the softmax -> `y_my`

These are distinct objects, corresponding to $t_k$ and $y_k$ respectively.
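As a sanity check of the value 0.684028 (my own arithmetic, not part of the original): because $t_k$ is one-hot, only the softmax probability at the true class of each sample survives the inner sum, so the mean loss can be reproduced directly from the printed `softmax(z)`:

```python
import numpy as np

# softmax probabilities at the true classes 0, 0, 2, 1, read off the output above
p_true = np.array([0.265387929, 0.665240956, 0.367165401, 1.0])
losses = -np.log(p_true)   # per-sample cross-entropy
print(losses.mean())       # ≈ 0.684028, matching TensorFlow
```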

## Problem 4

4) Which classes are predicted by the model (maximum entry). (1 point)

```python
print('The predicted class by Numpy:')
y_pre_cl = np.argmax(y_my, axis=1)
print(y_pre_cl)
```


### A quick note on one-hot vectors

Suppose we obtained the list

```python
[0.10, 0.05, 0.85]
```

(this corresponds to one row picked from `softmax(z)` in the present problem). Since `np.argmax([0.10, 0.05, 0.85]) = 2`, we learn that this array says the sample belongs to category 2 with a probability of about 0.85 × 100 = 85%.

So taking the index of the largest entry tells us the category. Conversely, if only the correct label is marked with some distinguished number so it can be told apart from the rest, the computer merely has to find that mark to know the category. Putting a 1 at the correct label and 0 everywhere else, the index where the 1 sits identifies the correct class. This encoding is called a one-hot vector. For our labels it reads:

```python
[ [1, 0, 0],  # 0
  [1, 0, 0],  # 0
  [0, 0, 1],  # 2
  [0, 1, 0] ] # 1
```

(Ignore the fine points of the notation.)

## Problem 5

5) How many samples are correctly classified (accuracy)? (1 point)

```python
accuracy_my = np.mean(y_pre_cl == y_cl)
print('accuracy by Numpy: %f' % accuracy_my)
```
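This one-liner works because `==` compares the two arrays elementwise, and `np.mean` treats the resulting True/False values as 1/0:

```python
import numpy as np

y_pred = np.array([1, 0, 2, 1])   # predictions from np.argmax above
y_true = np.array([0, 0, 2, 1])   # ground-truth labels
correct = (y_pred == y_true)      # boolean array, one entry per sample
print(correct)                    # [False  True  True  True]
print(correct.mean())             # 3 of 4 correct -> 0.75
```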


# Closing
