More than 5 years have passed since last update.

移動平均法

Last updated at 2016-07-11Posted at 2016-07-07

　時系列データを分析する時、周期的なデータが存在することは、しばしばあります。この周期性を考慮しなければなりません。移動平均法をご紹介します。ここで、一番シンプルの移動平均法を説明します。

データ

y：目的変数
x：説明変数

            y          x
 [1,] 16.3420349  7.21799609
 [2,]  9.4853139 20.07930955
 [3,]  2.0946638  2.54631266
 [4,]  3.0991178  8.43088240
 [5,]  7.8069991 10.23466188
 [6,] 13.9609309  0.45914370
 [7,]  3.3191959  6.34949213
 [8,] 14.1245312 25.38715182
 [9,] 19.1783122 31.96572503
[10,]  2.6551769  3.69963485
[11,] 23.5168751 22.20547723
[12,]  6.0558261  9.21209711
[13,]  1.5733379  2.07579350
[14,]  1.1696452  0.57584675
[15,]  5.4701931  6.58309332
[16,] 17.7192605  4.28368676
[17,]  2.1629390  7.82622938
[18,]  3.9961220  2.56963947
[19,] -0.4549355  1.65004434
[20,] 17.1749453 37.00024382
[21,] 22.6155962 16.01943730
[22,] 10.6688664 13.77812541
[23,] -1.6582567  0.09505423
[24,]  4.6173663  3.14344014
[25,]  2.1885696  2.06269316

yを分解してみると、周期性があると分かりました。

移動平均法

上の図を御覧ください。データサイズは25で、5個のデータが1つの周期であることが分かります。
では、次のように、処理します。
$$my_{t-4}=\frac{1}{5}\sum_{t-4}^{t}y_i$$
同じく
$$mx_{t-4}=\frac{1}{5}\sum_{t-4}^{t}x_i$$

	y	my	x	mx
1	$$y_1$$	-	$$x_1$$	-
2	$$y_2$$	-	$$x_2$$	-
3	$$y_3$$	-	$$x_3$$	-
4	$$y_4$$	-	$$x_4$$	-
5	$$y_5$$	$$my_1$$	$$x_5$$	$$mx_1$$
6	$$y_6$$	$$my_2$$	$$x_6$$	$$mx_2$$
$$\vdots$$
24	$$y_{24}$$	$$my_{20}$$	$$x_{24}$$	$$mx_{20}$$
25	$$y_{25}$$	$$my_{21}$$	$$x_{25}$$	$$mx_{21}$$

$my$を目的変数と、$mx$を説明変数として、回帰分析をすればよいです。

下記の分析結果より、
$$my_t=3.5+0.5ma_t$$
となりますが、$y$と$x$の関係はどうでしょう。
平均値の性質によって、
$$y_t=c_t+0.5x_t$$
ここで、$c_t$は定数項です。しかも、周期性があるので、$c_t$の5個の平均値は一定で、3.5となるはずです。

Call:
lm(formula = y_ma ~ x_ma)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.9029 -0.2751 -0.0007  0.3610  0.7421 

Coefficients:
            Estimate Std. Error t value Pr(>|t|) 

(Intercept)  3.48831    0.27022   12.91 7.47e-11 ***
x_ma         0.50053    0.02392   20.92 1.40e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4807 on 19 degrees of freedom
Multiple R-squared:  0.9584,    Adjusted R-squared:  0.9562 
F-statistic: 437.7 on 1 and 19 DF,  p-value: 1.401e-14

実は、このデータは発生した乱数です。
$$y_t=b_1x_{1,t}+b_2x_{2,t}+\varepsilon_t$$
数列$x_1=(50,1,2,3,4,50,1,2,3,4,\cdots)$
数列$x_2$は$\lambda=0.1$指数乱数列
$b_1,b_2$の設定値は$b_1=0.3,b_2=0.5$

code.r

t <- 25　　#行の数
n <- 1    #実行回数
b1 <- 0.3
b2 <- 0.5

X1 <- matrix( c(50,1,2,3,4), nrow = t, ncol = n )
X2 <- matrix( rexp(t*n,0.1), nrow = t, ncol = n )
error <- matrix( rnorm(t*n,0,5^0.5), nrow = t, ncol = n )
Y <- b1*X1 + b2*X2 + error

モンテカルロ法

1回の結果は偶然かもしれませんが、10000回で実行し、同じ結果であるかとチェックしてみましょう。
理論値の計算：
$b_1=0.3$
$b_2=0.5$
$c=(50+1+2+3+4)/5*b_1=3.6$

$c$の理論値	$c$の推定	$b_2$の理論	$b_2$の推定
3.6	3.6116	0.5	0.4994

> apply(mValue,2,mean)
[1] 3.6116440 0.4994208

理論値と推定は一致しています。

コード

code1.r

t <- 25　　#行の数
n <- 1    #実行回数
b1 <- 0.3
b2 <- 0.5

X1 <- matrix( c(50,1,2,3,4), nrow = t, ncol = n )
X2 <- matrix( rexp(t*n,0.1), nrow = t, ncol = n )
error <- matrix( rnorm(t*n,0,5^0.5), nrow = t, ncol = n )
Y <- b1*X1 + b2*X2 + error
#以上のように、乱数を発生する。
#=========================
y <- as.matrix(Y[,1])
x <- as.matrix(X2[,1])
cbind(y,x)
Ts <- ts(y,frequency = 5)
De <- decompose(Ts)
fname <- paste("Decompose", ".png", sep ="")
png(file = fname, height = 500, width = 800)
plot(De)
dev.off()
#yを分解し、季節性をチェックする。
#==========================
p <- 5
y_ma <- as.matrix(y[p:length(y),])
x_ma <- as.matrix(x[p:length(y),])
for (i in p:length(y)) {
    x_ma[i-p+1,] <- mean(x[(i-p+1):i,])
    y_ma[i-p+1,] <- mean(y[(i-p+1):i,])
}
summary( lm(y_ma ~ x_ma))
#移動平均を取り、回帰分析する。

モンテカルロ.r

t <- 25　　#行の数
n <- 10000    #実行回数
b1 <- 0.3
b2 <- 0.5

X1 <- matrix( c(50,1,2,3,4), nrow = t, ncol = n )
X2 <- matrix( rexp(t*n,0.1), nrow = t, ncol = n )
error <- matrix( rnorm(t*n,0,5^0.5), nrow = t, ncol = n )
Y <- b1*X1 + b2*X2 + error
# ======================================
p <- 5
Fun_ma <- function(x) {
  x_ma <- matrix(0,(length(x)-p+1),1)
  for (i in p:length(x)) {
    x_ma[i-p+1,] <- mean(x[(i-p+1):i])
  }
  x_ma
}
mx <- apply(X2,2,Fun_ma)
my <- apply(Y,2,Fun_ma)

mValue <- matrix(0,n,2)
for (i in 1:n){
  y <- my[,i]
  x <- mx[,i]
  mValue[i,] <- c(lm(y~x)$coefficients)
}
apply(mValue,2,mean)

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up