
Logistic Regression (ロジスティック回帰)

Last updated at Posted at 2019-07-08

Introduction

This page is a summary of my own study, so I'd be grateful for any and all corrections.

Logistic Regression

Logistic regression applies a linear model to binary classification: an input is assigned to class 1 when the sigmoid of the linear output exceeds a threshold, and to class 0 otherwise. Here the threshold is 0.5, and the loss function is defined as follows.

$$ y = w \cdot x + b $$

$$ sigmoid(x) = \frac{1}{1 + \exp(-x)} $$

$$ loss = -\frac{1}{n}\sum^{n}_{k = 1}{\left( t \cdot \log(sigmoid(y)) + (1 - t) \cdot \log(1 - sigmoid(y)) \right)} $$

We use this to perform two-class classification.
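The linear output, sigmoid, and cross-entropy loss above can be sketched directly in NumPy (a minimal illustration; the values of `y` and `t` below are made-up examples, not the iris data):

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def cross_entropy_loss(y, t):
    # y: raw linear outputs (w . x + b), t: 0/1 labels
    p = sigmoid(y)
    return -np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

y = np.array([3.0, -2.0, 0.5])   # example linear outputs
t = np.array([1.0, 0.0, 1.0])    # example labels

print(sigmoid(0.0))              # 0.5 -- so the 0.5 threshold sits at y = 0
print(cross_entropy_loss(y, t))
```

The loss shrinks as the sigmoid outputs move toward the correct labels, which is exactly what gradient descent will exploit below.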


import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn import datasets

sess = tf.Session()

# Classify [setosa, versicolor] vs. [virginica]

iris = datasets.load_iris()
x_vals = iris.data
target = iris.target

# Label each sample 1 if it is virginica (target == 2), otherwise 0
y_vals = np.array([1 if t == 2 else 0 for t in target])

learning_rate = 0.05
batch_size = 25

x_data = tf.placeholder(shape = [None, 4], dtype = tf.float32)
y_target = tf.placeholder(shape = [None, 1], dtype = tf.float32)

A = tf.Variable(tf.random_normal(shape = [4, 1]))
b = tf.Variable(tf.random_normal(shape = [1, 1]))

model_output = tf.add(tf.matmul(x_data, A), b)

loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits = model_output, labels = y_target))

optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(loss)

# Initialize variables after the full graph (including the optimizer) is defined
init = tf.global_variables_initializer()
sess.run(init)

prediction = tf.round(tf.sigmoid(model_output))
prediction_correct = tf.cast(tf.equal(prediction, y_target), tf.float32)
accuracy = tf.reduce_mean(prediction_correct)

loss_vec = []
accuracy_vec = []

for i in range(1000):
    
    rand_index = np.random.choice(len(x_vals), size = batch_size)
    rand_x = x_vals[rand_index]
    rand_y = np.transpose([y_vals[rand_index]])
    
    sess.run(train, feed_dict = {x_data: rand_x, y_target: rand_y})
    
    tmp_accuracy, temp_loss = sess.run([accuracy, loss], feed_dict = {x_data: rand_x, y_target: rand_y})
    
    loss_vec.append(temp_loss)
    accuracy_vec.append(tmp_accuracy)
    
    if (i + 1) % 25 == 0:
        
        print("Step #" + str(i + 1) + " A = " + str(sess.run(A)) + " b = " + str(sess.run(b)))
        print("Loss = " + str(temp_loss))
        print("Acc = " + str(tmp_accuracy))

plt.plot(loss_vec, "k-")
plt.title("Cross-Entropy Loss per Generation")
plt.xlabel("Generation")
plt.ylabel("Cross-Entropy Loss")
plt.show()

plt.plot(accuracy_vec, "k-")
plt.title("Accuracy per Generation")
plt.xlabel("Generation")
plt.ylabel("Accuracy")
plt.show()

(Output plots: cross-entropy loss per generation, and accuracy per generation)

If the training curves look like this, it worked.
The model seems to classify well (though I haven't validated it on test data...).
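A held-out check is easy to bolt on. Here is a minimal sketch using scikit-learn's `train_test_split`, with scikit-learn's `LogisticRegression` standing in for the hand-rolled TensorFlow model above, on the same virginica-vs-rest labels:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
x_vals = iris.data
y_vals = (iris.target == 2).astype(int)  # 1 = virginica, 0 = the rest

# Hold out 25% of the samples for evaluation
x_train, x_test, y_train, y_test = train_test_split(
    x_vals, y_vals, test_size=0.25, random_state=0, stratify=y_vals)

clf = LogisticRegression(max_iter=1000).fit(x_train, y_train)
print("test accuracy:", clf.score(x_test, y_test))
```

Accuracy measured on samples the model never saw is the number that actually tells you whether the classifier generalizes.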
I wonder if there's a way to extend logistic regression to multi-class classification (mutters to self)
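On that multi-class question: the standard generalization is softmax (multinomial) logistic regression, which replaces the sigmoid with a softmax over the K classes. A minimal sketch with scikit-learn on all three iris classes (scikit-learn's solver choices here are an assumption of the sketch, not part of the article's TensorFlow code):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0,
    stratify=iris.target)

# With 3 classes, LogisticRegression fits a multinomial (softmax) model
clf = LogisticRegression(max_iter=1000).fit(x_train, y_train)
print("classes:", clf.classes_)          # [0 1 2]
print("test accuracy:", clf.score(x_test, y_test))
```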
