About the k-Nearest Neighbours Method

Last updated at 2018-08-08, posted at 2018-08-08

What is the k-nearest neighbours method?

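KNN (K-Nearest Neighbor)
It is used for classification problems.
It is used to find the data that are most similar to a given sample.
It decides which group the target data belongs to based on its distance to the known data.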

Explanation

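In the example figure, we work out which of two groups, the blue squares or the red triangles, the green circle (the target data) should be assigned to. With two features the data lie in a two-dimensional plane, as in the figure; with n features the space is n-dimensional (this n is the number of features and is unrelated to the k that counts neighbours). Assigning the target to the class of the single closest point is the nearest neighbour method (Nearest Neighbour).
With k=3, however, the target falls into the red group, whereas with k=7, looking inside the dotted circle, the blue group has more members, so in that case it is more appropriate to assign the target to the blue-square group. Deciding by majority among the k closest points in this way is called the k-nearest neighbour method (k-Nearest Neighbour).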
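
The majority vote described here can be sketched in a few lines of plain Python. This is only a minimal illustration, not the method OpenCV uses internally: the function name `knn_classify` and the toy coordinates are assumptions made up for this example, chosen so that two red points sit closest to the target and five blue points lie a little further out.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep only the k nearest.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the labels among the neighbours and pick the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data (not taken from the article's figure).
points = [(1.0, 0.0), (0.0, 1.2), (1.5, 1.5), (-2.0, 0.5),
          (0.5, -2.2), (2.5, 0.0), (-1.0, -2.5)]
labels = ["red", "red", "blue", "blue", "blue", "blue", "blue"]
query = (0.0, 0.0)

print(knn_classify(points, labels, query, k=3))  # red
print(knn_classify(points, labels, query, k=7))  # blue
```

With this data the answer flips from red to blue as k grows from 3 to 7, which mirrors the situation in the figure.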

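Incidentally, with k=4 the neighbourhood contains two red and two blue points, so the target cannot be classified. For this reason, with two classes an odd k is preferable.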
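
The tie can be made visible by counting votes for several values of k. The snippet below is a small sketch under assumed data: the list `neighbour_labels` and the helper `vote` are hypothetical and simply stand in for the labels of the neighbours ordered from nearest to farthest.

```python
from collections import Counter

# Hypothetical neighbour labels, ordered from nearest to farthest.
neighbour_labels = ["red", "red", "blue", "blue", "blue"]


def vote(labels, k):
    """Count labels among the first k neighbours and report whether the top two tie."""
    counts = Counter(labels[:k])
    ranked = counts.most_common()
    is_tie = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
    return dict(counts), is_tie


for k in (3, 4, 5):
    print(k, vote(neighbour_labels, k))
```

Note that an odd k rules out ties only when there are exactly two classes; with three or more classes a tie is still possible.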

Modified k-nearest neighbours (modified kNN)

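With k=4, the plain k-nearest neighbour method cannot decide whether the green circle belongs to the red or the blue group.
The modified k-nearest neighbour method resolves this by giving each data point a weight according to how close it is to the target data.
In the example figure above, the target data is then assigned to the red group.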
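
One possible way to implement such weighting is sketched below. The article does not say which weighting scheme is meant, so the inverse-distance weight `1 / distance` used here is an assumption, as are the function name `weighted_knn_classify`, the small `eps` guard against division by zero, and the toy coordinates.

```python
import math
from collections import defaultdict


def weighted_knn_classify(train_points, train_labels, query, k, eps=1e-9):
    """Classify query by summing inverse-distance weights of its k nearest neighbours."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Closer neighbours contribute larger weights (assumed scheme: 1 / distance).
    scores = defaultdict(float)
    for distance, label in nearest:
        scores[label] += 1.0 / (distance + eps)
    best_label = max(scores, key=scores.get)
    return best_label, dict(scores)


# Hypothetical toy data: with k=4 the plain vote is a 2-2 tie,
# but the two red points are closer to the query than the two blue ones.
points = [(1.0, 0.0), (0.0, 1.2), (-2.0, 0.0), (0.0, -2.5), (3.0, 3.0)]
labels = ["red", "red", "blue", "blue", "blue"]
query = (0.0, 0.0)

label, scores = weighted_knn_classify(points, labels, query, k=4)
print(label, scores)
```

Because the red points are nearer, their summed weight is larger and the tie at k=4 is broken in favour of red.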

Sample program

k_nearest.py

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Feature set containing (x,y) values of 25 known/training data
trainData = np.random.randint(0,100,(25,2)).astype(np.float32)

# Labels each one either Red or Blue with numbers 0 and 1
responses = np.random.randint(0,2,(25,1)).astype(np.float32)

# Take Red families and plot them
red = trainData[responses.ravel()==0]
plt.scatter(red[:,0],red[:,1],80,'r','^')

# Take Blue families and plot them
blue = trainData[responses.ravel()==1]
plt.scatter(blue[:,0],blue[:,1],80,'b','s')


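# Generate a new sample (the "newcomer") to classify and plot it in green
newcomer = np.random.randint(0,100,(1,2)).astype(np.float32)
plt.scatter(newcomer[:,0],newcomer[:,1],80,'g','o')

# Train the kNN model and look up the 3 nearest neighbours of the newcomer
knn = cv2.ml.KNearest_create()
knn.train(trainData,cv2.ml.ROW_SAMPLE,responses)
ret, results, neighbours, dist = knn.findNearest(newcomer, 3)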

print ("result: ", results,"\n")
print ("neighbours: ", neighbours,"\n")
print ("distance: ", dist)

plt.show()

Execution result

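From result=0 we can see that the newcomer belongs to the red group.
The neighbours, from nearest to farthest, are blue, red, red.
The distances were 281, 328, and 401.
The values are random, so the result changes on every run.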

result:  [[0.]] 

neighbours:  [[1. 0. 0.]] 

distance:  [[281. 328. 401.]]

References

https://www.codexa.net/collaborative-filtering-k-nearest-neighbor/
http://labs.eecs.tottori-u.ac.jp/sd/Member/oyamada/OpenCV/html/py_tutorials/py_ml/py_knn/py_knn_understanding/py_knn_understanding.html
