@Tatsuki-Oike(Tatsuki Oike)

Implementing k-means in R (fast)

R

Last updated at 2020-11-15, posted at 2020-11-15

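Purpose of this article

This article implements k-means in R.
I tried to avoid for loops as much as possible to keep execution fast.
Reference: ノンパラメトリックベイズ 点過程と統計的機械学習の数理 (Nonparametric Bayes: Point Processes and the Mathematics of Statistical Machine Learning)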

Contents

No.	Section
1	Model description
2	Data
3	Implementation
4	Verification

1. Model description
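
In the notation of the code in Section 3 (x_n is the n-th row of X, mu_k the k-th row of mu, z_n the cluster label of point n), the standard k-means iteration can be sketched as follows. This is the textbook formulation, assumed here to match what the implementation does; it is not quoted from the reference.

```latex
\begin{align}
  J(z, \mu) &= \sum_{n=1}^{N} \lVert x_n - \mu_{z_n} \rVert^2
    && \text{(objective being decreased)} \\
  z_n &\leftarrow \operatorname*{arg\,min}_{k \in \{1, \dots, K\}} \lVert x_n - \mu_k \rVert^2
    && \text{(step (i): assignment)} \\
  \mu_k &\leftarrow \frac{1}{\lvert \{ n : z_n = k \} \rvert} \sum_{n : z_n = k} x_n
    && \text{(step (ii): mean update)}
\end{align}
```

Neither step can increase J, which is why alternating them settles on a (local) optimum.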

2. Data

The data used is the iris dataset.

X <- iris[,1:4]
D <- ncol(X)
N <- nrow(X)

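3. Implementation

# (1) Choose K
K <- 3
# (2) Initialize mu randomly (feature means plus N(0, 1) noise)
set.seed(100)
mu <- matrix(rep(apply(X,2,mean),each=K)+rnorm(K*D,0,1), nrow=K)
# (3) Repeat (i) and (ii)
max.iter <- 30
for(s in 1:max.iter){
  # (i) Assign each point to the cluster whose mean is closest
  #     (tmp is an N x K matrix of squared distances)
  tmp <- apply(mu, 1, function(x) apply(t(t(X)-x)^2, 1, sum))
  z <- apply(tmp, 1, which.min)
  # (ii) Recompute the mean of each cluster
  mu <- apply(X, 2, function(x) tapply(x, z, mean))
}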
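
The loop above always runs max.iter iterations and relies on every cluster staying non-empty (otherwise tapply returns fewer than K rows). As a sketch only, the same steps could be wrapped in a function that stops once the assignments no longer change and keeps the previous center when a cluster is empty. The name kmeans_sketch, its arguments, and the expanded squared-distance formula are assumptions made for illustration; they are not part of the original article.

```r
# Hypothetical helper (not from the original article): k-means with early stopping.
kmeans_sketch <- function(X, K, max.iter = 30, seed = 100) {
  X <- as.matrix(X)
  N <- nrow(X)
  D <- ncol(X)
  set.seed(seed)
  # Same initialization as in the article: feature means plus N(0, 1) noise
  mu <- matrix(rep(colMeans(X), each = K) + rnorm(K * D, 0, 1), nrow = K)
  z <- rep(0L, N)
  for (s in seq_len(max.iter)) {
    # (i) Squared distances as ||x||^2 + ||mu||^2 - 2 x.mu, an N x K matrix
    d2 <- outer(rowSums(X^2), rowSums(mu^2), "+") - 2 * X %*% t(mu)
    z.new <- apply(d2, 1, which.min)
    if (all(z.new == z)) break          # assignments unchanged: stop early
    z <- z.new
    # (ii) Update the non-empty clusters; empty ones keep their old center
    sums <- rowsum(X, z)                # per-cluster column sums, sorted by label
    idx <- as.integer(rownames(sums))   # labels of the non-empty clusters
    mu[idx, ] <- sums / as.vector(table(z))
  }
  list(cluster = z, centers = mu, iter = s)
}

fit <- kmeans_sketch(iris[, 1:4], K = 3)
table(fit$cluster)
```

The matrix product replaces the nested apply calls in the distance step, which is one way to push more of the work into vectorized code; whether it is actually faster on a given dataset would need to be measured.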

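4. Verification

The left plot shows the true labels; the right plot shows the clusters produced by the implementation above.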

library(cluster)
par(mfrow=c(1,2))
clusplot(X, iris[,5], color=TRUE, shade=FALSE, labels=4, lines=0)
clusplot(X, z, color=TRUE, shade=FALSE, labels=4, lines=0)
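
The plots give a visual comparison; a cross-tabulation gives a numeric one. The sketch below is self-contained, so it uses R's built-in kmeans() as a stand-in for the vector z from Section 3 (an assumption for illustration; substitute z to check the implementation itself). The names z.ref, tab and purity are hypothetical.

```r
# Stand-in labels from the built-in kmeans(); replace z.ref with z from
# Section 3 to evaluate the hand-written implementation instead.
X <- iris[, 1:4]
set.seed(100)
z.ref <- kmeans(X, centers = 3, nstart = 10)$cluster

# Rows: true species, columns: cluster labels (label numbers are arbitrary)
tab <- table(species = iris[, 5], cluster = z.ref)
print(tab)

# Share of points that belong to the majority species of their cluster
purity <- sum(apply(tab, 2, max)) / sum(tab)
purity
```

Because cluster numbers carry no meaning, the table should be read column by column: a clean clustering has one dominant species per column.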
