[R] Two-Way ANOVA

Posted at 2021-01-07

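These are my notes-to-self on doing statistical analysis in R.

Data used

統計の時間 ("Statistics Time")

I'll use the worked example from this site.

First, create the data frame.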

Soil <- factor(c("A","A","A","A","A","A","A","A","A","A","A","A",
                 "B","B","B","B","B","B","B","B","B","B","B","B"))
Fertilizer <- factor(c("100","100","100","200","200","200","300","300","300","400","400","400",
                       "100","100","100","200","200","200","300","300","300","400","400","400"))
Yield <- c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
           16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
df <- data.frame(soil=Soil, fertilizer=Fertilizer, yield=Yield)
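Before running the analysis, it can help to confirm that the design is balanced, i.e. that every soil/fertilizer combination has the same number of replicates. The sketch below rebuilds the same data frame with rep() instead of typing each label, then cross-tabulates the factors. The names cell_counts and cell_means are my own and are not part of the original example.

```r
# Rebuild the same data frame, generating the factor labels with rep().
df <- data.frame(
  soil = factor(rep(c("A", "B"), each = 12)),
  fertilizer = factor(rep(rep(c("100", "200", "300", "400"), each = 3), times = 2)),
  yield = c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
            16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
)

# Number of observations in each soil x fertilizer cell (hypothetical name).
cell_counts <- table(df$soil, df$fertilizer)
print(cell_counts)

# Mean yield in each cell, useful for eyeballing a possible interaction.
cell_means <- tapply(df$yield, list(df$soil, df$fertilizer), mean)
print(round(cell_means, 2))
```

If every count is 3, the data are replicated and balanced, which is the situation where the interaction term can be tested.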

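Two-way ANOVA

We use the aov() function.

The syntax is aov(response ~ explanatory variables, data = data frame name).

If every variable in the data frame other than the response should be used as an explanatory variable,
you can also write response ~ . instead.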
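As a small illustration of the dot shorthand, the sketch below fits yield ~ . on the example data and lists the terms R actually used. Note that the dot only adds the remaining columns as main effects; it does not add an interaction term. The name fit_dot is my own.

```r
# Same data as in the article, built compactly with rep().
df <- data.frame(
  soil = factor(rep(c("A", "B"), each = 12)),
  fertilizer = factor(rep(rep(c("100", "200", "300", "400"), each = 3), times = 2)),
  yield = c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
            16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
)

# "." stands for every column of df except the response.
fit_dot <- aov(yield ~ ., data = df)

# Terms in the fitted model: main effects only, no soil:fertilizer.
print(attr(terms(fit_dot), "term.labels"))
summary(fit_dot)
```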

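In a two-way ANOVA on data without replication, you can test the significance of the main effects of factor A and factor B on the observed values.

In a two-way ANOVA on data with replication, on the other hand, you can test the interaction between factor A and factor B in addition to the significance of each main effect.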
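One way to see why replication matters is to count residual degrees of freedom. The sketch below is only an illustration under an assumption of mine: it mimics unreplicated data by collapsing the example to one mean per cell (the article itself does not do this). With one value per cell, a model that includes the interaction uses up all the degrees of freedom, so nothing is left to estimate the error and no F test is possible.

```r
df <- data.frame(
  soil = factor(rep(c("A", "B"), each = 12)),
  fertilizer = factor(rep(rep(c("100", "200", "300", "400"), each = 3), times = 2)),
  yield = c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
            16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
)

# Illustrative only: one observation per cell (the cell mean), 2 x 4 = 8 rows.
cell_df <- aggregate(yield ~ soil + fertilizer, data = df, FUN = mean)

# Main effects only: some residual degrees of freedom remain.
fit_main <- aov(yield ~ soil + fertilizer, data = cell_df)

# Main effects plus interaction: the model is saturated.
fit_full <- aov(yield ~ soil * fertilizer, data = cell_df)

print(df.residual(fit_main))
print(df.residual(fit_full))
```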

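To include the interaction, add factorA*factorB to the formula. In R's formula syntax, A*B already expands to A + B + A:B, where A:B is the interaction term itself, so also listing the main effects separately is redundant but harmless.

In this case, we add soil*fertilizer.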
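To check how R reads these formulas, you can ask terms() for the term labels. The sketch below compares three spellings; labels_of is a small helper of my own, not a built-in function.

```r
# Helper (hypothetical name): list the terms R extracts from a formula.
labels_of <- function(f) attr(terms(f), "term.labels")

f_star    <- yield ~ soil * fertilizer                      # shorthand
f_long    <- yield ~ soil + fertilizer + soil:fertilizer    # fully spelled out
f_article <- yield ~ soil + fertilizer + soil*fertilizer    # as written in this article

print(labels_of(f_star))
print(labels_of(f_long))
print(labels_of(f_article))
```

All three should list the same three terms, so they describe the same model.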

The code is shown below.

The summary() function prints the ANOVA table.

AOV <- aov(yield ~ soil + fertilizer + soil*fertilizer, data = df)
summary(AOV)

Output

> AOV <- aov(yield ~ soil + fertilizer +soil*fertilizer, data = df)

> summary(AOV)
                Df Sum Sq Mean Sq F value   Pr(>F)    
soil             1  66.33   66.33  46.333 4.21e-06 ***
fertilizer       3 126.64   42.21  29.485 9.40e-07 ***
soil:fertilizer  3  17.79    5.93   4.142   0.0237 *  
Residuals       16  22.91    1.43                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
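As a sanity check on how to read the table, each F value is the effect's mean square divided by the residual mean square, and the p-value is the upper tail of the F distribution with the corresponding degrees of freedom. The sketch below recomputes both from the summary() table; tab, f_manual, and p_manual are names of my own.

```r
df <- data.frame(
  soil = factor(rep(c("A", "B"), each = 12)),
  fertilizer = factor(rep(rep(c("100", "200", "300", "400"), each = 3), times = 2)),
  yield = c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
            16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
)
AOV <- aov(yield ~ soil + fertilizer + soil*fertilizer, data = df)

# summary() on an aov fit returns a list; the first element is the ANOVA table.
tab <- summary(AOV)[[1]]

# Rows 1-3 are soil, fertilizer, soil:fertilizer; row 4 is Residuals.
ms <- tab[["Mean Sq"]]
dof <- tab[["Df"]]

# F = effect mean square / residual mean square.
f_manual <- ms[1:3] / ms[4]

# p = upper-tail probability of the F distribution.
p_manual <- pf(f_manual, df1 = dof[1:3], df2 = dof[4], lower.tail = FALSE)

print(cbind(F = f_manual, p = p_manual))
```

These should agree with the F value and Pr(>F) columns printed above.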

Final script


# Clear the workspace
rm(list = ls())

# Create the data frame
Soil <- factor(c("A","A","A","A","A","A","A","A","A","A","A","A",
                 "B","B","B","B","B","B","B","B","B","B","B","B"))
Fertilizer <- factor(c("100","100","100","200","200","200","300","300","300","400","400","400",
                       "100","100","100","200","200","200","300","300","300","400","400","400"))
Yield <- c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
           16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
df <- data.frame(soil=Soil, fertilizer=Fertilizer, yield=Yield)

# Two-way ANOVA (aov() function)
AOV <- aov(yield ~ soil + fertilizer + soil*fertilizer, data = df)
summary(AOV)

Reference: two-way ANOVA (lm() and anova() functions)

In R, you can also run the same analysis of variance by combining the lm() and anova() functions.

LM <- lm(yield ~ soil + fertilizer + soil*fertilizer, data = df)
anova(LM)

Output

> LM <- lm(yield ~ soil + fertilizer +soil*fertilizer, data = df)

> anova(LM)
Analysis of Variance Table

Response: yield
                Df  Sum Sq Mean Sq F value    Pr(>F)    
soil             1  66.334  66.334 46.3332 4.214e-06 ***
fertilizer       3 126.638  42.213 29.4850 9.403e-07 ***
soil:fertilizer  3  17.791   5.930  4.1423   0.02373 *  
Residuals       16  22.907   1.432                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
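The two tables differ only in how many digits are printed. If you want to confirm that in code rather than by eye, a rough sketch is to compare the columns with all.equal(); aov_tab and lm_tab are names of my own.

```r
df <- data.frame(
  soil = factor(rep(c("A", "B"), each = 12)),
  fertilizer = factor(rep(rep(c("100", "200", "300", "400"), each = 3), times = 2)),
  yield = c(14.5, 15.1, 14.1, 16.5, 16.1, 15, 17.8, 19, 15.2, 18.1, 20.2, 17.2,
            16.2, 15.3, 17.5, 18.6, 16.9, 18.6, 21.7, 20.5, 19.4, 23.6, 24.9, 25.5)
)

AOV <- aov(yield ~ soil + fertilizer + soil*fertilizer, data = df)
LM  <- lm(yield ~ soil + fertilizer + soil*fertilizer, data = df)

aov_tab <- summary(AOV)[[1]]  # ANOVA table from aov()
lm_tab  <- anova(LM)          # ANOVA table from lm() + anova()

# Compare the numeric columns; all.equal() allows for tiny floating-point differences.
print(all.equal(aov_tab[["Sum Sq"]],  lm_tab[["Sum Sq"]]))
print(all.equal(aov_tab[["F value"]], lm_tab[["F value"]]))
print(all.equal(aov_tab[["Pr(>F)"]],  lm_tab[["Pr(>F)"]]))
```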