Learning Activation Functions and Neural Networks from High School Math


1. Basics of high school math functions

1.1 Linear function

  • Formula:
    $y = ax + b$

  • Properties:

    • The graph is a straight line
    • $a$: slope
    • $b$: intercept
  • Example use in AI: corresponds to computing a simple weighted sum

1.2 Parabola (quadratic function)

  • Formula:
    $y = ax^2 + bx + c$

  • Properties:

    • The graph is U-shaped ($a > 0$) or an inverted U ($a < 0$)
    • Vertex: $\left( -\frac{b}{2a}, f\left(-\frac{b}{2a}\right) \right)$
  • Example use in AI: squared error (MSE) and curve approximation

1.3 Square root function

  • Formula:
    $y = \sqrt{x}$

  • Properties:

    • Domain: $x \geq 0$
    • A nonlinearity whose growth gradually levels off
  • Example use in AI: similar in shape to part of the Bent Identity function

1.4 Exponential function

  • Formula:
    $y = a^x$

  • Properties:

    • Grows or decays rapidly
    • The inverse of the logarithmic function (common or natural log)
  • Example use in AI: used in computing the Sigmoid and Softmax functions

1.5 Logarithmic function

  • Formula:
    $y = \log_a x$

  • Properties:

    • Domain: $x > 0$
    • Increases only slowly
  • Example use in AI: used in the Softplus function

1.6 Trigonometric functions

  • Formula:
    $y = \sin x, \cos x$

  • Properties:

    • Periodic
    • Express oscillations and waves
  • Example use in AI: SIREN (Sinusoidal Representation Networks); a sketch plotting all six families follows below
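
To connect these six function families to code, here is a minimal NumPy/matplotlib sketch that evaluates and plots one representative of each family; the coefficients are arbitrary sample values, not taken from the article.

```python
# Plot one sample from each of the six high-school function families in section 1.
import numpy as np
import matplotlib.pyplot as plt

x_all = np.linspace(-4.0, 4.0, 200)
x_pos = np.linspace(0.01, 4.0, 200)   # for sqrt (x >= 0) and log (x > 0)

samples = {
    'linear: y = 2x + 1':       (x_all, 2 * x_all + 1),
    'quadratic: y = x^2 - 2x':  (x_all, x_all**2 - 2 * x_all),
    'square root: y = sqrt(x)': (x_pos, np.sqrt(x_pos)),
    'exponential: y = 2^x':     (x_all, 2.0 ** x_all),
    'logarithm: y = log2(x)':   (x_pos, np.log2(x_pos)),
    'sine: y = sin(x)':         (x_all, np.sin(x_all)),
}

fig, axes = plt.subplots(2, 3, figsize=(10, 6))
for ax, (title, (xv, yv)) in zip(axes.ravel(), samples.items()):
    ax.plot(xv, yv)
    ax.set_title(title)
    ax.grid(True)
fig.tight_layout()
plt.show()
```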


2. Overview of neural networks

  • Forward propagation: the computation in which input data passes through each layer to produce the output
  • Backpropagation: the process of computing gradients from the output error and updating the weights
  • Activation function: gives each layer's output a nonlinearity, raising the network's expressive power (see the sketch below)
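
A quick numerical way to see why the activation function matters: without one, stacking linear layers still produces a single linear map, so depth adds no expressive power. The sketch below checks this with NumPy; the layer sizes and random values are illustrative assumptions, not tied to the demo later in the article.

```python
# Two linear layers with no activation collapse into one linear layer;
# inserting ReLU between them breaks that equivalence.
import numpy as np

rng = np.random.default_rng(0)
x  = rng.normal(size=(5, 4))                  # batch of 5 inputs with 4 features
W1 = rng.normal(size=(4, 3)); b1 = rng.normal(size=3)
W2 = rng.normal(size=(3, 2)); b2 = rng.normal(size=2)

# Without an activation: (x W1 + b1) W2 + b2 == x (W1 W2) + (b1 W2 + b2)
two_layers = (x @ W1 + b1) @ W2 + b2
one_layer  = x @ (W1 @ W2) + (b1 @ W2 + b2)
print(np.allclose(two_layers, one_layer))     # True: still just one linear map

# With ReLU in between, the composition is genuinely nonlinear
with_relu = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))      # False (in general)
```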

3. Representative activation functions and their high school math counterparts

| High school function | Activation function | Formula |
| --- | --- | --- |
| Linear function | ReLU (Rectified Linear Unit) | $f(x) = \max(0, x)$ |
| Parabola | HardSwish | $f(x) = x \cdot \frac{\text{ReLU6}(x+3)}{6}$ |
| Square root | Bent Identity | $f(x) = \frac{\sqrt{x^2+1} - 1}{2} + x$ |
| Exponential | Sigmoid | $f(x) = \frac{1}{1 + e^{-x}}$ |
| Exponential | Softmax | $\text{softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}}$ |
| Logarithm | Softplus | $f(x) = \log(1 + e^x)$ |
| Trigonometric | Snake | $f(x) = x + \frac{1}{\alpha} \sin^2(\alpha x)$ |
| Trigonometric | SIREN | $f(x) = \sin(\omega_0 x)$ |
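
As a quick bridge from the table above to code, the sketch below evaluates each listed activation at a single sample input. ReLU6, which appears in the HardSwish row, is $\min(\max(0, x), 6)$; the values of $\alpha$ and $\omega_0$ are arbitrary sample choices, not prescribed by the article.

```python
# Evaluate the activations listed in the table above at a sample input x = 1.5.
import numpy as np

x = 1.5
z = np.array([1.5, 0.0, -1.0])                       # sample logits for softmax

relu       = max(0.0, x)
relu6      = min(max(0.0, x + 3.0), 6.0)             # ReLU6(x+3): clipped to [0, 6]
hardswish  = x * relu6 / 6.0
bent_id    = (np.sqrt(x**2 + 1.0) - 1.0) / 2.0 + x
sigmoid    = 1.0 / (1.0 + np.exp(-x))
softplus   = np.log1p(np.exp(x))
alpha, w0  = 1.0, 30.0                               # sample α (Snake) and ω0 (SIREN)
snake      = x + np.sin(alpha * x)**2 / alpha
siren      = np.sin(w0 * x)
softmax    = np.exp(z - z.max()) / np.exp(z - z.max()).sum()

print(relu, hardswish, bent_id, sigmoid, softplus, snake, siren)
print(softmax, softmax.sum())                        # probabilities summing to 1
```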

4. Example equations for forward and backpropagation (one layer + ReLU)

Forward propagation

  1. Input $x \in \mathbb{R}^n$, weights $W \in \mathbb{R}^{m \times n}$, bias $b \in \mathbb{R}^m$

  2. Linear transformation:

    $$
    z = W x + b
    $$

  3. Activation (ReLU):

    $$
    a = \max(0, z)
    $$

  4. Pass the output $a$ to the next layer (see the worked example below)
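
A tiny worked instance of these four steps, using made-up numbers for $W$, $b$, and $x$ with $n = m = 2$:

```python
# Forward pass z = Wx + b, a = max(0, z) with concrete 2x2 numbers.
import numpy as np

W = np.array([[1.0, -2.0],
              [0.5,  1.0]])     # W in R^{2x2}
b = np.array([0.5, -1.0])       # b in R^2
x = np.array([1.0,  1.0])       # x in R^2

z = W @ x + b                   # Wx = [-1.0, 1.5], so z = [-0.5, 0.5]
a = np.maximum(0.0, z)          # ReLU zeroes the negative entry: a = [0.0, 0.5]
print(z, a)
```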

Backpropagation

  • Gradient computation with respect to the loss function $L$:

    1. Gradient through the activation (ReLU):

      $$
      \frac{\partial L}{\partial z_i} =
      \begin{cases}
      \frac{\partial L}{\partial a_i} & z_i > 0 \\
      0 & z_i \le 0
      \end{cases}
      $$

    2. Gradient with respect to the weights:

      $$
      \frac{\partial L}{\partial W} = \frac{\partial L}{\partial z} \cdot x^T
      $$

    3. Gradient with respect to the bias:

      $$
      \frac{\partial L}{\partial b} = \frac{\partial L}{\partial z}
      $$

    4. Gradient passed back to the input:

      $$
      \frac{\partial L}{\partial x} = W^T \cdot \frac{\partial L}{\partial z}
      $$
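
As a sanity check on these backpropagation formulas, here is a minimal finite-difference gradient check for the one-layer ReLU case. The helper names layer_forward/layer_backward and the MSE-style loss are my own illustrative choices; the code uses the batched row-vector convention $z = xW + b$, so the $W$ and $x$ gradients appear transposed relative to the column-vector formulas above.

```python
# Compare the analytic backprop gradients with central finite differences.
import numpy as np

def layer_forward(x, W, b):
    z = x @ W + b                      # linear step (row-vector/batch convention)
    a = np.maximum(0.0, z)             # ReLU
    return a, z

def layer_backward(x, W, z, dL_da):
    dL_dz = dL_da * (z > 0)            # ReLU gate: gradient passes only where z > 0
    dW = x.T @ dL_dz                   # matches dL/dW = dL/dz * x^T (transposed layout)
    db = dL_dz.sum(axis=0)             # matches dL/db = dL/dz, summed over the batch
    dx = dL_dz @ W.T                   # matches dL/dx = W^T * dL/dz
    return dW, db, dx

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
b = rng.normal(size=2)
y = rng.normal(size=(4, 2))

def loss(W_):
    a, _ = layer_forward(x, W_, b)
    return 0.5 * np.mean((a - y) ** 2)

a, z = layer_forward(x, W, b)
dL_da = (a - y) / a.size               # gradient of 0.5 * mean((a - y)^2)
dW, db, dx = layer_backward(x, W, z, dL_da)

eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (loss(W_plus) - loss(W_minus)) / (2 * eps)
print(numeric, dW[0, 0])               # the two values should agree closely
```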


# -*- coding: utf-8 -*-
# Activation plots + "backprop" (derivative) plots: visualize each activation function and its derivative
# Dependencies
import numpy as np
import matplotlib.pyplot as plt

# ----------------------------
# 1) Scalar activations & derivatives
# ----------------------------
def relu(x):
    return np.maximum(0.0, x)
def drelu(x):
    return (x > 0).astype(float)

def hswish(x):
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0
def dhswish(x):
    y = np.zeros_like(x, dtype=float)
    y[x <= -3.0] = 0.0
    y[x >= 3.0]  = 1.0
    m = (x > -3.0) & (x < 3.0)
    y[m] = (2.0*x[m] + 3.0) / 6.0
    return y

def bent_identity(x):
    return x + (np.sqrt(x**2 + 1.0) - 1.0) / 2.0
def dbent_identity(x):
    return 1.0 + x / (2.0*np.sqrt(x**2 + 1.0))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))
def dsigmoid(x):
    s = sigmoid(x)
    return s*(1.0 - s)

def softplus(x):
    # stable: max(0,x)+log1p(exp(-|x|))
    return np.maximum(0.0, x) + np.log1p(np.exp(-np.abs(x)))
def dsoftplus(x):
    return sigmoid(x)  # d/dx softplus = sigmoid

def snake(x, alpha=1.0):
    return x + (1.0/alpha)*np.sin(alpha*x)**2
def dsnake(x, alpha=1.0):
    return 1.0 + np.sin(2.0*alpha*x)  # 1 + 2 sin cos

def siren(x, w0=1.0):
    return np.sin(w0*x)
def dsiren(x, w0=1.0):
    return w0*np.cos(w0*x)

# ----------------------------
# 2) Softmax (vector) & Jacobian diagonal plot
# ----------------------------
def softmax(z, axis=-1):
    z = z - np.max(z, axis=axis, keepdims=True)  # stable
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def softmax_diag_derivative(z):
    """
    Return diag(J) where J_ij = ∂s_i/∂z_j.
    s_i(1-s_i) on the diagonal (diagonal entries only).
    """
    s = softmax(z, axis=-1)
    return s*(1.0 - s)

# ----------------------------
# 3) Plot settings
# ----------------------------
xs = np.linspace(-6.0, 6.0, 800)

# --- ReLU ---
plt.figure()
plt.plot(xs, relu(xs), label='ReLU f(x)')
plt.plot(xs, drelu(xs), label="ReLU' (x)")
plt.title('ReLU: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- HardSwish ---
plt.figure()
plt.plot(xs, hswish(xs), label='HardSwish f(x)')
plt.plot(xs, dhswish(xs), label="HardSwish' (x)")
plt.title('HardSwish: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- Bent Identity ---
plt.figure()
plt.plot(xs, bent_identity(xs), label='BentIdentity f(x)')
plt.plot(xs, dbent_identity(xs), label="BentIdentity' (x)")
plt.title('Bent Identity: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- Sigmoid & Softplus ---
plt.figure()
plt.plot(xs, sigmoid(xs), label='Sigmoid f(x)')
plt.plot(xs, dsigmoid(xs), label="Sigmoid' (x)")
plt.title('Sigmoid: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

plt.figure()
plt.plot(xs, softplus(xs), label='Softplus f(x)')
plt.plot(xs, dsoftplus(xs), label="Softplus' (x) = Sigmoid")
plt.title('Softplus: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- Snake (alpha=1) ---
alpha = 1.0
plt.figure()
plt.plot(xs, snake(xs, alpha), label=f'Snake f(x), α={alpha}')
plt.plot(xs, dsnake(xs, alpha), label="Snake' (x) = 1 + sin(2αx)")
plt.title('Snake: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- SIREN (w0=2) ---
w0 = 2.0
plt.figure()
plt.plot(xs, siren(xs, w0), label=f'SIREN sin(w0 x), w0={w0}')
plt.plot(xs, dsiren(xs, w0), label="SIREN' = w0 cos(w0 x)")
plt.title('SIREN: activation & derivative')
plt.xlabel('x'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# --- Softmax path: z=[t,0,0] ---
ts = np.linspace(-6.0, 6.0, 400)
Z = np.stack([ts, np.zeros_like(ts), np.zeros_like(ts)], axis=1)  # (T,3)
S = softmax(Z, axis=1)     # (T,3)
Ddiag = softmax_diag_derivative(Z)  # (T,3)

plt.figure()
plt.plot(ts, S[:,0], label='softmax_1(t,0,0)')
plt.plot(ts, S[:,1], label='softmax_2(t,0,0)')
plt.plot(ts, S[:,2], label='softmax_3(t,0,0)')
plt.title('Softmax components along z=[t,0,0]')
plt.xlabel('t'); plt.ylabel('probability'); plt.legend(); plt.grid(True); plt.tight_layout()

plt.figure()
plt.plot(ts, Ddiag[:,0], label='∂s1/∂z1 = s1(1-s1)')
plt.plot(ts, Ddiag[:,1], label='∂s2/∂z2 = s2(1-s2)')
plt.plot(ts, Ddiag[:,2], label='∂s3/∂z3 = s3(1-s3)')
plt.title('Softmax diagonal derivatives (Jacobian diag)')
plt.xlabel('t'); plt.ylabel('value'); plt.legend(); plt.grid(True); plt.tight_layout()

# ----------------------------
# 4) Tiny forward-backward demo (1-layer + ReLU)
# ----------------------------
def forward_relu_layer(x, W, b):
    z = x @ W + b
    a = relu(z)
    cache = (x, W, b, z, a)
    return a, cache

def mse_loss(a, y):
    return 0.5*np.mean((a - y)**2)

def backward_relu_layer(dL_da, cache):
    x, W, b, z, a = cache
    dL_dz = dL_da * drelu(z)         # ⊙ ReLU'(z)
    dW = x.T @ dL_dz
    db = dL_dz.sum(axis=0)
    dx = dL_dz @ W.T
    return dW, db, dx, dL_dz

# demo data
np.random.seed(42)
N, D, M = 64, 3, 2
x = np.random.randn(N, D)
true_W = np.array([[1.0, -0.5],[0.3, 0.8],[-0.2, 0.1]])
true_b = np.array([0.2, -0.1])
y = relu(x @ true_W + true_b) + 0.05*np.random.randn(N, M)  # targets (with noise)

# initialize parameters
W = 0.5*np.random.randn(D, M)
b = np.zeros(M)
lr = 0.1

loss_hist = []
for it in range(50):
    a, cache = forward_relu_layer(x, W, b)
    loss = mse_loss(a, y)
    loss_hist.append(loss)
    dL_da = (a - y) / a.size  # gradient of 0.5*mean((a-y)^2), i.e. (a - y)/(N*M)
    dW, db, dx, dL_dz = backward_relu_layer(dL_da, cache)
    W -= lr*dW
    b -= lr*db

# plot the loss history
plt.figure()
plt.plot(loss_hist, label='MSE loss')
plt.title('Training curve (1-layer ReLU)')
plt.xlabel('iteration'); plt.ylabel('loss'); plt.grid(True); plt.legend(); plt.tight_layout()

plt.show()