
A function for checking a model's predictive performance on large-scale data

Last updated at 2012-12-26. Posted at 2012-03-16.

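When a dataset gets large, even checking a model's predictive performance takes considerable time. The functions in KsPlot check performance while sampling from the data and plot the results, so performance can be visualized and evaluated in a short time. The code below checks predictive performance on data with one million samples.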

KsPlot.R

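library(KsPlot)

# Simulate one million observations: y depends linearly on x1
# and quadratically on x2, plus standard normal noise.
set.seed(1)
x1   <- rnorm(1000000)
set.seed(2)
x2   <- rnorm(1000000)
set.seed(3)
y    <- 2*x1 + x2^2 + rnorm(1000000)

# X1 holds the raw predictors; X2 adds the squared term x2^2 as x3.
X1      <- data.frame(x1 = x1, x2 = x2)
X2      <- data.frame(x1 = x1, x2 = x2, x3 = x2^2)

# Check predictive performance for X1 by sampling, and plot the result.
set.seed(1)
KsResult1 <- KsamplePlot(X1, y)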
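
To make the idea of checking performance by sampling concrete, here is a minimal sketch in base R. It is not how KsPlot is implemented: `sampled_rmse`, its arguments `n_sample` and `n_rep`, and the choice of a linear model with hold-out RMSE are all assumptions made for illustration. It draws small random samples repeatedly, fits on one half of each sample, scores on the other half, and plots the spread of the scores.

```r
# Hypothetical helper (not part of KsPlot): estimate hold-out RMSE of a
# linear model from repeated small random samples instead of the full data.
sampled_rmse <- function(X, y, n_sample = 2000, n_rep = 20) {
  stopifnot(nrow(X) == length(y), n_sample <= nrow(X))
  dat <- cbind(X, y = y)
  vapply(seq_len(n_rep), function(i) {
    idx   <- sample.int(nrow(dat), n_sample)   # rows drawn without replacement
    half  <- seq_len(n_sample %/% 2)
    train <- dat[idx[half], , drop = FALSE]    # first half: fit the model
    test  <- dat[idx[-half], , drop = FALSE]   # second half: score it
    fit   <- lm(y ~ ., data = train)
    sqrt(mean((test$y - predict(fit, newdata = test))^2))
  }, numeric(1))
}

# Smaller simulated data set with the same structure as in the article.
set.seed(1)
n  <- 100000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 * x1 + x2^2 + rnorm(n)

rmse1 <- sampled_rmse(data.frame(x1 = x1, x2 = x2), y)
rmse2 <- sampled_rmse(data.frame(x1 = x1, x2 = x2, x3 = x2^2), y)

# Compare the two predictor sets by the distribution of sampled RMSE values.
boxplot(list(X1 = rmse1, X2 = rmse2), ylab = "hold-out RMSE")
```

Each call touches only `n_sample * n_rep` rows, so the cost stays roughly constant as the full data set grows. In this sketch the predictor set that includes the squared term should show a clearly lower RMSE, because `y` depends on `x2^2`.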
