Why feature scaling matters

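I am currently taking the machine learning course on Coursera, and feature scaling came up. According to the course, feature scaling makes a difference in execution speed, so I implemented a small example in Python and compared the execution times.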

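To keep things simple, I implement an algorithm that moves a point on a plane toward the center.

Starting from the same initial values, I compare the execution time for an ellipse (a stand-in for the function before scaling) with that for a circle (a stand-in for the function after scaling).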
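For reference, the two functions being compared and their gradients can be written out as follows. The names f_ellipse and f_circle are my own notation, not from the course, and the last line is the usual gradient descent update with learning rate alpha.

```latex
\begin{align*}
f_{\text{ellipse}}(x_1, x_2) &= x_1^2 + 100000\,x_2^2, & \nabla f_{\text{ellipse}} &= (2x_1,\ 200000\,x_2) \\
f_{\text{circle}}(x_1, x_2) &= x_1^2 + x_2^2, & \nabla f_{\text{circle}} &= (2x_1,\ 2x_2) \\
x_i &\leftarrow x_i - \alpha\,\frac{\partial f}{\partial x_i}
\end{align*}
```

The ellipse is 100000 times steeper along x2 than along x1, which is the kind of imbalance that unscaled features produce.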

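import time


class gradient_discent:
    """Toy gradient descent that moves a point on a plane toward the origin."""

    def __init__(self, alpha):
        self.alpha = alpha  # learning rate

    def gradient(self, x, alpha):
        # One update step: x <- x - alpha * 2x, where 2x is the derivative of x**2.
        # The same step is applied to both coordinates for both functions.
        h = alpha * x
        x = x - h * 2
        return x

    def ellipse(self, x1, x2):
        y = x1**2 + 100000 * x2**2  # ellipse (stand-in for the unscaled function)
        return y

    def circle(self, x1, x2):
        y = x1**2 + x2**2  # circle (stand-in for the scaled function)
        return y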


x1 = 1  # initial value of x1
x2 = 1  # initial value of x2

test_feature_scaring = gradient_discent(0.0001)
y = test_feature_scaring.ellipse(x1, x2)

# Time the loop until the ellipse value drops to 0.001 or below.
time1 = time.perf_counter()
while y > 0.001:
    x1 = test_feature_scaring.gradient(x1, test_feature_scaring.alpha)
    x2 = test_feature_scaring.gradient(x2, test_feature_scaring.alpha)

    y = test_feature_scaring.ellipse(x1, x2)

time2 = time.perf_counter()

time3 = time2 - time1
print("Execution time before scaling: " + str(time3))

x1 = 1  # initial value of x1
x2 = 1  # initial value of x2

test_feature_scaring = gradient_discent(0.0001)
y = test_feature_scaring.circle(x1, x2)

# Time the loop until the circle value drops to 0.001 or below.
time1 = time.perf_counter()
while y > 0.001:
    x1 = test_feature_scaring.gradient(x1, test_feature_scaring.alpha)
    x2 = test_feature_scaring.gradient(x2, test_feature_scaring.alpha)

    y = test_feature_scaring.circle(x1, x2)

time2 = time.perf_counter()

time3 = time2 - time1
print("Execution time after scaling: " + str(time3))

Results

Execution time before scaling: 0.06296
Execution time after scaling: 0.02598
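Timings this short vary from run to run, so counting loop iterations gives a steadier comparison. Below is a minimal sketch that reuses the same update, initial values, learning rate, and threshold as the program above. The helper name count_iterations is my own and does not appear in the original program.

```python
def count_iterations(cost, x1=1.0, x2=1.0, alpha=0.0001, threshold=0.001):
    """Repeat the update x <- x - alpha * 2x on both coordinates
    until cost(x1, x2) <= threshold, and return the number of steps."""
    steps = 0
    while cost(x1, x2) > threshold:
        x1 = x1 - alpha * 2 * x1
        x2 = x2 - alpha * 2 * x2
        steps += 1
    return steps


def ellipse(x1, x2):
    # Stand-in for the unscaled function
    return x1**2 + 100000 * x2**2


def circle(x1, x2):
    # Stand-in for the scaled function
    return x1**2 + x2**2


ellipse_steps = count_iterations(ellipse)
circle_steps = count_iterations(circle)
print("ellipse steps:", ellipse_steps)
print("circle steps:", circle_steps)
print("ratio:", ellipse_steps / circle_steps)
```

With these settings the ratio of step counts should come out at roughly 2.4, which is close to the ratio of the measured times.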

Thoughts

The execution times differed by a factor of about 2.4 (0.06296 versus 0.02598), so I think I got a feel for what feature scaling does.
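The program above applies the same 2x step to both functions, so the ellipse's steeper x2 direction never enters the update. As a rough sketch of what changes when each function's own gradient is used, the code below counts steps for both cases. The function names and both learning rates are my own assumptions, not values from the course. For the ellipse the x2 update multiplies x2 by 1 - 200000 * alpha, so alpha has to stay below 1e-5 or the iteration diverges. The 0.0001 used earlier would multiply x2 by -19 on every step.

```python
def descend(cost, grad, alpha, start=(1.0, 1.0), threshold=0.001, max_steps=5_000_000):
    """Run gradient descent until cost <= threshold (or max_steps is hit);
    return the number of steps taken."""
    x1, x2 = start
    steps = 0
    while cost(x1, x2) > threshold and steps < max_steps:
        g1, g2 = grad(x1, x2)
        x1 = x1 - alpha * g1
        x2 = x2 - alpha * g2
        steps += 1
    return steps


def ellipse(x1, x2):
    return x1**2 + 100000 * x2**2


def ellipse_grad(x1, x2):
    # Partial derivatives of x1**2 + 100000 * x2**2
    return 2 * x1, 2 * 100000 * x2


def circle(x1, x2):
    return x1**2 + x2**2


def circle_grad(x1, x2):
    # Partial derivatives of x1**2 + x2**2
    return 2 * x1, 2 * x2


# Assumed learning rates (not from the original program):
# the ellipse needs alpha < 1e-5 to stay stable along x2,
# while the circle stays stable for any alpha < 1.
ellipse_steps = descend(ellipse, ellipse_grad, alpha=4e-6)
circle_steps = descend(circle, circle_grad, alpha=0.4)
print("ellipse steps:", ellipse_steps)
print("circle steps:", circle_steps)
```

Because the steep direction caps the learning rate, the ellipse crawls along x1 and needs several hundred thousand steps, while the circle can take a large step and finishes in a handful. This gap is much larger than the one measured above, and it is probably closer to what the course means by feature scaling speeding up gradient descent.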

A simple program for understanding feature scaling with gradient descent
