Perceptron with scikit-learn, Spark.ml, and TensorFlow (1): Introduction

Posted at 2017-05-07

Following on from linear regression, I tried classification with a perceptron using scikit-learn, Spark.ml, and TensorFlow.

1. Perceptron

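A perceptron is a simple neural network with no hidden layers, only an input layer and an output layer. scikit-learn provides a standalone Perceptron model, so I used that. For Spark.ml and TensorFlow, I built a neural network model that has only an input layer and an output layer.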
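To make the "input layer and output layer only" structure concrete, here is a minimal NumPy sketch of a perceptron with a step activation and the classic perceptron learning rule. It is not the scikit-learn, Spark.ml, or TensorFlow model used in this series. The function names `perceptron_train` and `perceptron_predict`, the learning rate, the epoch limit, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def perceptron_predict(w, bias, X):
    """Step activation: 1 where w . x + bias >= 0, else 0."""
    return (X @ w + bias >= 0).astype(int)

def perceptron_train(X, t, epochs=100, lr=1.0):
    """Classic perceptron learning rule (illustrative names and defaults)."""
    w = np.zeros(X.shape[1])  # one weight per input feature
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, ti in zip(X, t):
            pred = 1 if xi @ w + bias >= 0 else 0
            update = lr * (ti - pred)  # 0 when the point is already correct
            if update != 0:
                w += update * xi
                bias += update
                errors += 1
        if errors == 0:  # a full pass with no mistakes: stop early
            break
    return w, bias

# Hypothetical, linearly separable toy data: each row is one (x, y) point.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 3.0]])
t = np.array([0, 0, 1, 1])

w, bias = perceptron_train(X, t)
pred = perceptron_predict(w, bias, X)
```

The loop stops as soon as one pass over the data produces no misclassified point, which can only happen when the two classes are linearly separable.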
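A perceptron is a classifier, so we need data that says whether a given (x, y) belongs to class A or class B. This time, in nearly the same form as for linear regression, I generate data labeled by the following rule:

if ax + b >= y, the label is 0
if ax + b < y, the label is 1

The label therefore records on which side of the line y = ax + b each point lies.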
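As a small illustration of the labeling rule above, the sketch below applies it to three hand-picked points. The helper name `label_points` and the sample points are hypothetical. The values a = 0.4 and b = 0.8 are the ones the data-generation script passes to `makeDataPC`.

```python
import numpy as np

def label_points(a, b, x, y):
    """Return 0 where a*x + b >= y and 1 where a*x + b < y."""
    return (a * x + b < y).astype(int)

a, b = 0.4, 0.8
x = np.array([0.0, 5.0, 10.0])   # hypothetical sample points
y = np.array([0.0, 3.5, 4.0])

# The line gives 0.8, 2.8, 4.8 at these x values, so only the
# middle point lies above it.
labels = label_points(a, b, x, y)
```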

I'll skip the code walkthrough. This time the script draws a scatter plot with label 1 in blue and label 0 in red.

makeDataPC.py

#!/usr/bin/env python

import numpy as np
from numpy.random import rand

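# data for classifier
def makeDataPC(a, b, n=100, d=0.1, xs=0, xe=10):
    x = rand(n) * (xe - xs) + xs   # n x-values, uniform in [xs, xe)
    r = (rand(n) * 2 - 1) * d      # noise, uniform in [-d, d)
    y = x * a + b + r              # the point is on or above the line when r >= 0
    z = (r >= 0).astype(float)     # label: 1.0 where r >= 0, else 0.0
    return x, y, z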

import csv
def writeArrayWithCSV(dataFile, data):
    # newline='' leaves line endings to csv.writer (Python 3)
    with open(dataFile, 'w', newline='') as f:
        writer = csv.writer(f, lineterminator='\n')
        writer.writerows(data)

import matplotlib.pyplot as plt
def plotXY(title, x, y, z):
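    b = z.astype(bool)                 # boolean mask, True where the label is 1
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    bx = x[b]
    by = y[b]
    ax.scatter(bx, by, color='blue')   # label 1 in blue
    r = ~b                             # complement of the mask: label 0
    rx = x[r]
    ry = y[r]
    ax.scatter(rx, ry, color='red')    # label 0 in red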
    ax.set_title(title)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    #fig.show()                                                                      
    imageFile = title + '.png'
    fig.savefig(imageFile)

# training data                                                                      
x,y,z = makeDataPC(0.4, 0.8, 100, 0.8)
title = 'trainPC'
plotXY(title, x, y, z)
dataFile = title + '.csv'
xyz = np.c_[x, y, z]
writeArrayWithCSV(dataFile, xyz)

The script uses boolean indexing: when a boolean array such as [False, True, False] is used as the index of a numpy.array, NumPy returns an array of the elements at the positions where the index is True.

>>> n = np.array([1,2,3,4])
>>> b = np.array([False,True,False,True])
>>> n[b]
array([2, 4])

The inverse of a boolean array b is ~b. Older NumPy releases also accepted -b, but current releases raise a TypeError for it.

>>> b = np.array([False,True,False,True])
>>> ~b
array([ True, False,  True, False])
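Putting the two pieces together, the following sketch splits labeled points into the two classes with a boolean mask and its complement, which is the same idea the plotting function relies on. The sample values are hypothetical and only mimic the (x, y, label) layout of the generated data. `~` is used for the complement because that is the operator current NumPy expects for boolean arrays.

```python
import numpy as np

# Hypothetical sample in the same (x, y, label) layout as the generated data.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.5, 1.2, 2.5, 2.0])
z = np.array([1.0, 0.0, 1.0, 0.0])   # labels stored as floats

mask = z.astype(bool)                # True where the label is 1

ones = np.c_[x[mask], y[mask]]       # (x, y) pairs with label 1
zeros = np.c_[x[~mask], y[~mask]]    # (x, y) pairs with label 0
```

Every point lands in exactly one of the two arrays, so `len(ones) + len(zeros)` equals the number of input points.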
