Learning an OR Gate with Machine Learning (Personal Notes)

Python

Posted at 2026-04-22

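I wrote this for a university class, so I am posting it here. (I have a feeling it will become the foundation for my future work with Python and machine learning.)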

1. Setup

Nothing special here, just the imports.

import matplotlib.pyplot as plt
import japanize_matplotlib
import numpy as np

2. Overview

We build an OR gate with machine learning: we want a model that outputs $y=(1,1,0,1)$ for the inputs $([0,1],[1,0],[0,0],[1,1])$. The explanatory variable (the input) is written as a vector,

$$
x = \begin{pmatrix}0\\1\end{pmatrix} \;\text{or}\; \begin{pmatrix}1\\0\end{pmatrix} \;\text{or}\; \begin{pmatrix}0\\0\end{pmatrix} \;\text{or}\; \begin{pmatrix}1\\1\end{pmatrix}
$$

and the objective variable (the output) is

$$
y = \begin{pmatrix}y_0\\y_1\end{pmatrix}, \quad \text{where} \; y_0,y_1 \in [0,1]
$$

The teacher labels are

$$
\begin{aligned}
x = \begin{pmatrix}0\\1\end{pmatrix} &\leftrightarrow y = \begin{pmatrix}0\\1\end{pmatrix}
\\[3pt]
x = \begin{pmatrix}1\\0\end{pmatrix} &\leftrightarrow y = \begin{pmatrix}0\\1\end{pmatrix}
\\[3pt]
x = \begin{pmatrix}0\\0\end{pmatrix} &\leftrightarrow y = \begin{pmatrix}1\\0\end{pmatrix}
\\[3pt]
x = \begin{pmatrix}1\\1\end{pmatrix} &\leftrightarrow y = \begin{pmatrix}0\\1\end{pmatrix}
\end{aligned}
$$

The model is

$$
\hat{\boldsymbol{y}} = \sigma\left(W^T \boldsymbol{x} + \boldsymbol{b}\right), \quad \text{where} \; \sigma(x) = \dfrac{\exp(x)}{1+\exp(x)} \; \text{is applied elementwise}
$$

In other words, the goal of training is to adjust $W$ and $\boldsymbol{b}$ so that, for an input $\boldsymbol{x}$, the model outputs a $\hat{\boldsymbol{y}}$ as close as possible to the teacher label. For example,

$$
x = \begin{pmatrix}0\\0\end{pmatrix} \rightarrow \hat{y} = \begin{pmatrix}0.99\\0.01\end{pmatrix}
$$

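3. Algorithm

We use logistic regression, trained by gradient descent. The objective function is
$$
E = -\dfrac{1}{N}\sum_{n=1}^{N} \sum_{k=0}^{K-1} y_{nk}\log\hat{y}_{nk}
$$
where $N$ is the number of data points ($N=4$ here) and $K$ is the number of class labels ($K=2$ here, corresponding to $y_0,y_1$).

Since $y_0 + y_1 = 1$ holds in this problem, it is enough to learn $y_1$ alone instead of the full vector $\boldsymbol{y}$. Restating the setup: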

Explanatory variable:

$$
x = \begin{pmatrix}0\\1\end{pmatrix} \;\text{or}\; \begin{pmatrix}1\\0\end{pmatrix} \;\text{or}\; \begin{pmatrix}0\\0\end{pmatrix} \;\text{or}\; \begin{pmatrix}1\\1\end{pmatrix}
$$

Objective variable:
$$
y_1 \leadsto y, \quad \text{where} \; y \in [0,1]
$$
Teacher labels:

$$
\begin{aligned}
x = \begin{pmatrix}0\\1\end{pmatrix} &\leftrightarrow y=1
\\[3pt]
x = \begin{pmatrix}1\\0\end{pmatrix} &\leftrightarrow y=1
\\[3pt]
x = \begin{pmatrix}0\\0\end{pmatrix} &\leftrightarrow y=0
\\[3pt]
x = \begin{pmatrix}1\\1\end{pmatrix} &\leftrightarrow y=1
\end{aligned}
$$

Model:
$$
\hat{y} = \sigma\left(W^T \boldsymbol{x} + b\right)\quad , \quad\left(W\in M^{2\times 1},b\in \mathbb{R}\right)
$$
Objective function:

$$
E = -\dfrac{1}{N}\sum_{n=1}^{N} \left\{y_n \log \hat{y}_n  +(1-y_n)\log \left(1-\hat{y}_n\right)\right\}
$$
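As a quick sanity check on the reduction from two outputs to one, here is a small sketch comparing the one-hot cross-entropy with the scalar (binary) form. The prediction values are made up purely for illustration, and the check assumes the predicted probability of class 0 is exactly $1-\hat{y}$.

```python
import numpy as np

# Binary targets and hypothetical predictions for the "1" class (illustrative values only).
y1 = np.array([1.0, 1.0, 0.0, 1.0])
y1_hat = np.array([0.9, 0.8, 0.3, 0.95])

# One-hot form: column 0 is class "0", column 1 is class "1".
# Assumption: the class-0 probability is exactly 1 - y1_hat.
Y = np.stack([1 - y1, y1], axis=1)
Y_hat = np.stack([1 - y1_hat, y1_hat], axis=1)
E_onehot = -np.mean(np.sum(Y * np.log(Y_hat), axis=1))

# Scalar (binary) form of the same objective.
E_binary = -np.mean(y1 * np.log(y1_hat) + (1 - y1) * np.log(1 - y1_hat))

print(E_onehot, E_binary)  # the two values should agree up to rounding
```

Under that assumption the two expressions are the same sum written in two ways, which is why learning the single output is enough.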

The gradients and the parameter updates are then as follows (the index $i$ corresponds to $n$):

\begin{align*}
    \delta_i &= {\bf \hat{y}}_i - {\bf y}_i \\
    \nabla_{\bf W} E &= \frac{1}{N}\sum^N_{i=1}\delta_i {\bf x}_i \\
    \nabla_{\bf b} E &= \frac{1}{N}\sum^N_{i=1}\delta_i  \\
    {\bf W} &\leftarrow {\bf W} - \epsilon \nabla_{\bf W} E \\
    {\bf b} &\leftarrow {\bf b} - \epsilon \nabla_{\bf b} E
\end{align*}
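For reference, the simple form of $\delta_i$ comes out of the chain rule. A short derivation, writing $a_i = W^T \boldsymbol{x}_i + b$ (the symbol $a_i$ is introduced here only for this derivation) so that $\hat{y}_i = \sigma(a_i)$:

$$
\begin{aligned}
\frac{d\sigma}{da} &= \sigma(a)\left(1-\sigma(a)\right) \\
\frac{\partial E}{\partial \hat{y}_i} &= -\frac{1}{N}\left(\frac{y_i}{\hat{y}_i} - \frac{1-y_i}{1-\hat{y}_i}\right) = \frac{1}{N}\,\frac{\hat{y}_i - y_i}{\hat{y}_i\left(1-\hat{y}_i\right)} \\
\frac{\partial E}{\partial a_i} &= \frac{\partial E}{\partial \hat{y}_i}\,\hat{y}_i\left(1-\hat{y}_i\right) = \frac{1}{N}\left(\hat{y}_i - y_i\right) = \frac{1}{N}\,\delta_i
\end{aligned}
$$

Multiplying by $\partial a_i / \partial W = \boldsymbol{x}_i$ and $\partial a_i / \partial b = 1$ and summing over $i$ gives the two gradient expressions above.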

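4.1 Implementing the sigmoid function

We implement $\sigma$. It has to be written in a form that guards against overflow: $\exp(x)$ overflows for large positive $x$, so the implementation only ever evaluates $\exp$ at non-positive arguments, using $\exp(x)/(1+\exp(x))$ for $x<0$ and the equivalent $1/(1+\exp(-x))$ for $x\ge 0$.

$$
\sigma(x) = \dfrac{\exp(x)}{1+\exp(x)}
$$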

def sigmoid(x):
    # Both exponents are <= 0, so np.exp never overflows.
    return np.exp(np.minimum(x,0))/(1+np.exp(-np.abs(x)))


# Visualization
x = np.arange(-10,10,0.1)
plt.plot(x,sigmoid(x))
plt.show()
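To see why the overflow-safe form matters, here is a small comparison against a direct transcription of the formula. `sigmoid_naive` is a hypothetical helper written only for this comparison, and the test inputs are arbitrary extreme values.

```python
import numpy as np

def sigmoid_naive(x):
    # Direct transcription of exp(x) / (1 + exp(x)); exp overflows for large x.
    return np.exp(x) / (1 + np.exp(x))

def sigmoid(x):
    # Same form as the implementation above: both exponents are <= 0.
    return np.exp(np.minimum(x, 0)) / (1 + np.exp(-np.abs(x)))

x = np.array([-1000.0, 0.0, 1000.0])

# Silence the overflow / invalid-value warnings raised by the naive version.
with np.errstate(over="ignore", invalid="ignore"):
    naive = sigmoid_naive(x)
stable = sigmoid(x)

print(naive)   # the last entry is nan, because inf / inf is undefined
print(stable)  # stays within [0, 1] at both extremes
```

The naive version breaks only on the large positive side, where `np.exp(1000.0)` becomes `inf`. The safe version returns 0, 0.5 and 1 for these three inputs.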

4.2 Implementing the (training) data

Note: here the training data and the test data are the same.

x_train = np.array([[0,1],[1,0],[0,0],[1,1]])
y_train = np.array([[1],[1],[0],[1]])
x_test = x_train
y_test = y_train

# Visualization (red: label 0, blue: label 1)
plt.scatter(*x_train[y_train.squeeze()==0].T,color ="red")
plt.scatter(*x_train[y_train.squeeze()==1].T,color ="blue")
plt.xlim([-1,2])
plt.ylim([-1,2])
plt.show()

4.3 Setting up the objective function and model

Objective function:
$$ E ({\bf x}, {\bf y}; {\bf W}, {\bf b} ) = -\frac{1}{N}\sum^N_{i=1} \left[ {\bf y}_i \log {\bf \hat{y}}_i ({\bf x}_i; {\bf W}, {\bf b}) + (1 - {\bf y}_i) \log \left( 1 - {\bf \hat{y}}_i ({\bf x}_i; {\bf W}, {\bf b}) \right)\right] $$
Model:

$$
\hat{y} = \sigma\left(W^T \boldsymbol{x} + b\right)\quad , \quad\left(W\in M^{2\times 1},b\in \mathbb{R}\right)
$$
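The routines in 4.4 call `E_objective(x, y, W, b)`, but its definition does not appear in this article. Below is a minimal sketch of what it could look like, assuming it is exactly the binary cross-entropy written above. The clipping constant `tiny` is my own addition to keep `log` away from 0 and is not part of the formula.

```python
import numpy as np

def sigmoid(x):
    # Overflow-safe sigmoid, same form as in 4.1.
    return np.exp(np.minimum(x, 0)) / (1 + np.exp(-np.abs(x)))

def E_objective(x, y, W, b, tiny=1e-7):
    # Binary cross-entropy averaged over the batch.
    # `tiny` is an assumed safeguard (not in the article) against log(0).
    y_hat = sigmoid(np.matmul(x, W) + b)
    y_hat = np.clip(y_hat, tiny, 1 - tiny)
    return float(-np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)))

x = np.array([[0, 1], [1, 0], [0, 0], [1, 1]])
y = np.array([[1], [1], [0], [1]])
W = np.zeros((2, 1))
b = np.zeros(1)

# With W = 0 and b = 0 every prediction is 0.5, so the cost equals log(2).
print(E_objective(x, y, W, b), np.log(2))
```

The signature matches the call sites in `train` and `test`, so a definition along these lines placed before them should let the rest of the code run.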

4.4 Training routine

The training subroutine and the test routine.

\begin{align*}
    \delta_i &= {\bf \hat{y}}_i - {\bf y}_i \\
    \nabla_{\bf W} E &= \frac{1}{N}\sum^N_{i=1}\delta_i {\bf x}_i \\
    \nabla_{\bf b} E &= \frac{1}{N}\sum^N_{i=1}\delta_i  \\
    {\bf W} &\leftarrow {\bf W} - \epsilon \nabla_{\bf W} E \\
    {\bf b} &\leftarrow {\bf b} - \epsilon \nabla_{\bf b} E
\end{align*}

def train(x,y,eps): # eps is the learning rate
    global W,b
    batchsize = x.shape[0]
    y_hat = sigmoid(np.matmul(x,W)+b)  # forward pass
    delta = y_hat - y
    cost = E_objective(x,y,W,b)  # objective function from 4.3, evaluated before the update
    dW = np.matmul(x.T,delta)/batchsize
    db = delta.mean()
    W = W-eps*dW  # gradient descent step
    b = b-eps*db
    return cost

def test(x,y):
    global W,b
    y_hat = sigmoid(np.matmul(x,W)+b)
    cost = E_objective(x,y,W,b)
    return cost,y_hat
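One way to gain confidence in the `dW` and `db` expressions used in `train` is to compare them with finite differences of the objective. The sketch below does that on the OR data. The helpers `objective` and `analytic_grads`, the seed and the step size `h` are my own choices for illustration, not part of the code above.

```python
import numpy as np

def sigmoid(x):
    return np.exp(np.minimum(x, 0)) / (1 + np.exp(-np.abs(x)))

def objective(x, y, W, b):
    # Binary cross-entropy, matching the objective in section 3.
    y_hat = sigmoid(x @ W + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def analytic_grads(x, y, W, b):
    # Same expressions as in train(): dW = x^T delta / N, db = mean(delta).
    delta = sigmoid(x @ W + b) - y
    return x.T @ delta / x.shape[0], delta.mean()

x = np.array([[0, 1], [1, 0], [0, 0], [1, 1]], dtype=float)
y = np.array([[1], [1], [0], [1]], dtype=float)
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
W = rng.uniform(-0.08, 0.08, size=(2, 1))
b = np.zeros(1)

dW, db = analytic_grads(x, y, W, b)

# Central finite differences, one parameter at a time.
h = 1e-6
num_dW = np.zeros_like(W)
for j in range(W.shape[0]):
    step = np.zeros_like(W)
    step[j, 0] = h
    num_dW[j, 0] = (objective(x, y, W + step, b) - objective(x, y, W - step, b)) / (2 * h)
num_db = (objective(x, y, W, b + h) - objective(x, y, W, b - h)) / (2 * h)

print(np.max(np.abs(dW - num_dW)), abs(db - num_db))  # both differences should be tiny
```

If the analytic and numerical gradients disagree by more than rounding error, the usual suspects are a missing `1/N` factor or a transposed `x`.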

4.5 Results

Here we evaluate after every single training step.

E_evaluate_1=[]
E_evaluate_2=[]
E_evaluate_3=[]
y_evaluate_1=[]
y_evaluate_2=[]
y_evaluate_3=[]

num_epoch=2000

# Initialize W and b: W from a uniform distribution, b at 0.
W = np.random.uniform(low=-0.08,high=0.08, size = (2,1)).astype('float32')
b = np.zeros(shape=(1,)).astype('float32')

for epoch in range(num_epoch):
    train(x_train,y_train,1.0)
    cost,y_hat = test(x_test,y_test)
    E_evaluate_1.append(cost)
    y_evaluate_1.append(y_hat)

# Initialize W and b: W from a uniform distribution, b at 0.
W = np.random.uniform(low=-0.08,high=0.08, size = (2,1)).astype('float32')
b = np.zeros(shape=(1,)).astype('float32')

for epoch in range(num_epoch):
    train(x_train,y_train,0.1)
    cost,y_hat = test(x_test,y_test)
    E_evaluate_2.append(cost)
    y_evaluate_2.append(y_hat)

# Initialize W and b: W from a uniform distribution, b at 0.
W = np.random.uniform(low=-0.08,high=0.08, size = (2,1)).astype('float32')
b = np.zeros(shape=(1,)).astype('float32')

for epoch in range(num_epoch):
    train(x_train,y_train,10)
    cost,y_hat = test(x_test,y_test)
    E_evaluate_3.append(cost)
    y_evaluate_3.append(y_hat)


plt.xlim([-5,1050])
plt.yscale("log")
plt.plot(E_evaluate_3,label="eps=10")
plt.plot(E_evaluate_1,label="eps=1.0")
plt.plot(E_evaluate_2,label="eps=0.1")
plt.title("Objective function over training")
plt.legend(loc="upper right")
plt.show()

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].plot(np.array(y_evaluate_3)[:,0,0],label="[0,1]",)
axes[0].plot(np.array(y_evaluate_3)[:,1,0],label="[1,0]")
axes[0].plot(np.array(y_evaluate_3)[:,2,0],label="[0,0]")
axes[0].plot(np.array(y_evaluate_3)[:,3,0],label="[1,1]")
axes[0].set_title("eps=10")

axes[1].plot(np.array(y_evaluate_1)[:,0,0],label="[0,1]",)
axes[1].plot(np.array(y_evaluate_1)[:,1,0],label="[1,0]")
axes[1].plot(np.array(y_evaluate_1)[:,2,0],label="[0,0]")
axes[1].plot(np.array(y_evaluate_1)[:,3,0],label="[1,1]")
axes[1].set_title("eps=1")

axes[2].plot(np.array(y_evaluate_2)[:,0,0],label="[0,1]",)
axes[2].plot(np.array(y_evaluate_2)[:,1,0],label="[1,0]")
axes[2].plot(np.array(y_evaluate_2)[:,2,0],label="[0,0]")
axes[2].plot(np.array(y_evaluate_2)[:,3,0],label="[1,1]")
axes[2].legend(loc="right")
axes[2].set_title("eps=0.1")
plt.show()
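Besides the plots, the trained model can be checked directly by thresholding its outputs. The following self-contained sketch repeats the same update rule with `eps = 1.0` for 2000 epochs and compares the rounded predictions with the OR truth table. The 0.5 threshold and the fixed seed are my assumptions; they are not used in the code above.

```python
import numpy as np

def sigmoid(x):
    return np.exp(np.minimum(x, 0)) / (1 + np.exp(-np.abs(x)))

x = np.array([[0, 1], [1, 0], [0, 0], [1, 1]], dtype=float)
y = np.array([[1], [1], [0], [1]], dtype=float)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
W = rng.uniform(-0.08, 0.08, size=(2, 1))
b = np.zeros(1)

eps = 1.0
for epoch in range(2000):
    # Same gradient descent update as in train().
    delta = sigmoid(x @ W + b) - y
    W = W - eps * (x.T @ delta) / x.shape[0]
    b = b - eps * delta.mean()

y_hat = sigmoid(x @ W + b)
pred = (y_hat >= 0.5).astype(int)  # assumed decision threshold of 0.5

print(y_hat.ravel())
print(pred.ravel())  # compare with the OR labels [1, 1, 0, 1]
```

Because the OR data is linearly separable, the predictions keep moving toward 0 and 1 as training continues, which is the behavior the per-input curves above show.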
