numpy
Machine Learning
Python3
coursera

Implementing the formulas from Coursera Machine Learning Week 2 and 3 in Python


I'll reimplement the formulas that appear in Coursera Machine Learning Week 2 and Week 3 in Python.
If you spot anything done poorly, I'd appreciate you pointing it out.
Parameter initialization, datasets, and the like are omitted.

Week 2: Linear Regression

Cost function

Hypothesis function

h_\theta(x) = \theta^{T}x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n

Cost function

J(\theta)= \frac{1}{2m} \sum_{i=1}^{m} (h_\theta (x^{(i)})-y^{(i)})^2
import numpy as np

def hypothesis(X, theta):
    h = np.dot(X, theta)
    return h

# m = number of training examples, h = predictions from the hypothesis, y = labels
def costFunction(m, h, y):
    J = np.sum(np.power(h - y, 2))/(2*m)
    return J
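
As a quick sanity check, here is a minimal sketch of calling these two functions; the tiny dataset and initial theta below are made-up values for illustration, with x_0 = 1 as the bias term.

import numpy as np

# made-up data: 3 examples, first column is the bias term x_0 = 1
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0])   # happens to follow y = 1 + 2x
theta = np.array([0.0, 0.0])    # initial parameters
m = len(y)

h = hypothesis(X, theta)        # all zeros with this initial theta
print(costFunction(m, h, y))    # 83/6 ≈ 13.83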

Gradient descent

\theta_j := \theta_j - \alpha \frac{1}{m}\sum_{i=1}^{m} \left( h_\theta (x^{(i)})-y^{(i)} \right) x_j^{(i)}
import numpy as np

def main():
    # list for storing the cost at each iteration
    J = []
    # gradient descent; alpha = learning rate
    # (m, theta, X, y, iter_num, and alpha are assumed to be prepared beforehand)
    theta, J = gradientDescent(m, theta, X, y, iter_num, alpha, J)

def hypothesis(X, theta):
    h = np.dot(X, theta)
    return h

def costFunction(m, h, y):
    J = np.sum(np.power(h - y, 2))/(2*m)
    return J

def gradientDescent(m, theta, X, y, iter_num, alpha, J):
    for i in range(iter_num):
        h = hypothesis(X, theta)
        J.append(costFunction(m, h, y))
        theta = theta - (alpha/m) * np.dot(X.T, h - y)
    return theta, J
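
For completeness, a minimal sketch of an end-to-end run on the same made-up data as above; alpha and iter_num are arbitrary choices for illustration.

import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0])
theta = np.zeros(2)
m = len(y)

theta, J = gradientDescent(m, theta, X, y, iter_num=1000, alpha=0.1, J=[])
print(theta)   # approaches [1, 2], i.e. y ≈ 1 + 2x
print(J[-1])   # cost should be close to 0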

Week 3: Logistic Regression

Sigmoid function

\sigma(z)=\frac{1}{1+e^{-z}}
import numpy as np

def sigmoid(z):
    g = 1/(1 + np.exp(-z))
    return g
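
A quick check with a few arbitrary inputs:

import numpy as np

print(sigmoid(0))                        # 0.5
print(sigmoid(np.array([-10, 0, 10])))   # ≈ [4.5e-05, 0.5, 0.99995]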

Cost function

E(\theta)=-\frac{1}{m}\left(\sum_{i=1}^{m} y^{(i)} \log h_\theta (x^{(i)})+ (1-y^{(i)}) \log (1-h_\theta (x^{(i)})) \right)+ \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

Note: the regularization term does not include $\theta_0$.

import numpy as np

def hypothesis(X, theta):
    h = np.dot(X, theta)
    return h

def sigmoid(z):
    return 1/(1 + np.exp(-z))

def costFunction(g, y, theta, m, lamda):
    # cross-entropy term plus L2 regularization (theta[0] excluded)
    J = -1/m * np.sum(y*np.log(g) + (1-y)*np.log(1-g)) + (lamda/(2*m)) * np.sum(theta[1:]*theta[1:])
    return J

Gradient descent

For $j = 0$:

\theta_0 := \theta_0 - \alpha \frac{\partial E(\theta)}{\partial \theta_0}
\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m} \left( h_\theta (x^{(i)})-y^{(i)} \right) x_0^{(i)}

For $j \geq 1$:

\theta_j := \theta_j - \alpha \frac{\partial E(\theta)}{\partial \theta_j}
\theta_j := \theta_j - \alpha \left( \frac{1}{m}\sum_{i=1}^{m} \left( h_\theta (x^{(i)})-y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m}\theta_j \right)
import numpy as np

def hypothesis(X, theta):
    h = np.dot(X, theta)
    return h

def sigmoid(z):
    return 1/(1 + np.exp(-z))

def costFunction(g, y, theta, m, lamda):
    # cross-entropy term plus L2 regularization (theta[0] excluded)
    J = -1/m * np.sum(y*np.log(g) + (1-y)*np.log(1-g)) + (lamda/(2*m)) * np.sum(theta[1:]*theta[1:])
    return J

def updateTheta(theta, X, g, y, alpha, m, lamda):
    # theta[0] is updated without the regularization term
    theta[0] = theta[0] - (alpha/m) * np.dot(X.T[0], g - y)
    # the remaining parameters include the (lambda/m) * theta_j term inside the gradient
    theta[1:] = theta[1:] - alpha * ((1/m) * np.dot(X.T[1:], g - y) + (lamda/m) * theta[1:])
    return theta

# gradient descent; iter_num = number of iterations
def gradientDescent(theta, X, y, iter_num, alpha, lamda):
    m = len(X)
    J = []
    for j in range(iter_num):

        h = hypothesis(X, theta)
        g = sigmoid(h)

        J.append(costFunction(g, y, theta, m, lamda))
        theta = updateTheta(theta, X, g, y, alpha, m, lamda)

    return theta, J
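
Finally, a minimal sketch of training and predicting with these functions; the toy data, alpha, lamda, and iter_num below are arbitrary values for illustration.

import numpy as np

# toy binary data: one feature plus the bias column x_0 = 1
X = np.array([[1.0, 0.5],
              [1.0, 1.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros(2)

theta, J = gradientDescent(theta, X, y, iter_num=5000, alpha=0.1, lamda=0.01)
pred = sigmoid(hypothesis(X, theta)) >= 0.5
print(theta)
print(pred)    # should separate the two classes on this toy set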