WAA06

Fundamentals of Data Analysis: weekly assignment A06

Assignment details

WAA06.1

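# Load and shape the data
# stringsAsFactors = TRUE keeps text columns as factors on R >= 4.0, which levels() below relies on
dat <- read.csv("http://peach.l.chiba-u.ac.jp/course_folder/waa05.csv",
                stringsAsFactors = TRUE)
# Reverse the level order so that "post" is compared against the other condition
dat$condition <- factor(dat$condition, levels(dat$condition)[2:1])
medicineA <- dat[dat$medicine == "Medicine A", ]
# Fit the linear model
medicineA.lm <- lm(blood.pressure ~ condition, data = medicineA)
summary(medicineA.lm)
# Output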
Call:
lm(formula = blood.pressure ~ condition, data = medicineA)

Residuals:
   Min     1Q Median     3Q    Max 
 -9.76  -3.88   0.18   4.12  11.12 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   130.8800     0.7014  186.61   <2e-16 ***
conditionpost -26.1200     0.9919  -26.33   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.959 on 98 degrees of freedom
Multiple R-squared:  0.8762,    Adjusted R-squared:  0.8749 
F-statistic: 693.5 on 1 and 98 DF,  p-value: < 2.2e-16

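The regression shows a significant difference in blood pressure before and after dosing for Medicine A: the estimated post-dose mean is 26.12 lower than the pre-dose mean of 130.88 (t = -26.33, p < 2e-16).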
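To make the coefficient table easier to read: when a two-level factor is the only predictor, the intercept is the mean of the reference level and the slope is the difference between the two group means. A minimal sketch on simulated data follows; the data frame `sim`, its values, and the level name "pre" are assumptions for illustration, not the course data.

```r
# Simulated stand-in for the Medicine A subset.
# The values and the level name "pre" are made up for illustration.
set.seed(1)
sim <- data.frame(
  condition = factor(rep(c("pre", "post"), each = 50),
                     levels = c("pre", "post")),
  blood.pressure = c(rnorm(50, mean = 130, sd = 5),
                     rnorm(50, mean = 105, sd = 5))
)

sim.lm <- lm(blood.pressure ~ condition, data = sim)

# Group means computed directly from the data
mean.pre  <- mean(sim$blood.pressure[sim$condition == "pre"])
mean.post <- mean(sim$blood.pressure[sim$condition == "post"])

# Intercept     = mean of the reference level ("pre")
# conditionpost = mean("post") - mean("pre")
coef(sim.lm)
c(mean.pre, mean.post - mean.pre)
```

The two printed vectors should agree up to floating-point error, which is why the `conditionpost` row can be read directly as the change relative to the reference condition.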

WAA06.2

# Subset the post-dose rows
post <- dat[dat$condition == "post", ]
# Fit the linear model
dat.lm <- lm(blood.pressure ~ medicine, data = post)
# Plot: code medicine as 0 (Medicine A) / 1 (Medicine B) so abline() matches the dummy coding
plot(as.numeric(factor(post$medicine)) - 1, post$blood.pressure, pch = 19,
     ylab = "blood pressure", xlab = "medicine type", xaxt = "n",
     xlim = c(-0.5, 1.5))
axis(1, c(0, 1), c("Medicine A", "Medicine B"))
abline(dat.lm, col = "red")

[Figure: RplotWAA06.1.png, post-dose blood pressure by medicine type with the fitted regression line in red]

summary(dat.lm)
# Output
Call:
lm(formula = blood.pressure ~ medicine, data = post)

Residuals:
    Min      1Q  Median      3Q     Max 
-11.180  -4.760  -0.470   4.385  17.820 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        104.7600     0.8742  119.84   <2e-16 ***
medicineMedicine B  25.4200     1.2363   20.56   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.181 on 98 degrees of freedom
Multiple R-squared:  0.8118,    Adjusted R-squared:  0.8099 
F-statistic: 422.8 on 1 and 98 DF,  p-value: < 2.2e-16

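The regression shows a significant difference in post-dose blood pressure between Medicine A and Medicine B: the Medicine B group is estimated to be 25.42 higher than the Medicine A mean of 104.76 (t = 20.56, p < 2e-16).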
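A regression on a single two-level factor is equivalent to a classical two-sample t-test with pooled variance, so the t value in the coefficient table can be cross-checked with `t.test()`. The sketch below uses simulated data; `sim2` and its values are assumptions for illustration, not the course data.

```r
# Simulated stand-in for the post-dose subset; values are made up.
set.seed(2)
sim2 <- data.frame(
  medicine = factor(rep(c("Medicine A", "Medicine B"), each = 50)),
  blood.pressure = c(rnorm(50, mean = 105, sd = 6),
                     rnorm(50, mean = 130, sd = 6))
)

sim2.lm <- lm(blood.pressure ~ medicine, data = sim2)
lm.t <- summary(sim2.lm)$coefficients["medicineMedicine B", "t value"]

# Two-sample t-test with pooled variance (var.equal = TRUE)
tt <- t.test(blood.pressure ~ medicine, data = sim2, var.equal = TRUE)

# t.test() tests mean(A) - mean(B), while lm() estimates mean(B) - mean(A),
# so the two statistics have the same magnitude and opposite sign.
c(lm = lm.t, t.test = unname(tt$statistic))
```

Both tests also share the same degrees of freedom (the number of observations minus 2), which corresponds to the "98 degrees of freedom" line in the summary output.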

WAA06.3

dat <- read.table("http://www.matsuka.info/data_folder/tdkPATH01.txt", header = TRUE)
# Scatterplot matrix of all variables
plot(dat)

[Figure: RplotWAA06.2.png, scatterplot matrix of the variables in dat]

All variables
varianceAll.lm <- lm(grade ~ study + absence + knowledge + interest, dat)
summary(varianceAll.lm)
# Output
Call:
lm(formula = grade ~ study + absence + knowledge + interest, 
    data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-17.900  -5.146  -0.587   6.524  13.202 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 54.94744    9.97293   5.510 9.83e-07 ***
study        0.08355    0.03667   2.279  0.02660 *  
absence     -0.58676    0.12810  -4.580 2.70e-05 ***
knowledge    0.36450    0.12650   2.882  0.00563 ** 
interest     0.86450    1.56025   0.554  0.58177    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.463 on 55 degrees of freedom
Multiple R-squared:  0.7608,    Adjusted R-squared:  0.7434 
F-statistic: 43.74 on 4 and 55 DF,  p-value: < 2.2e-16
3 variables
variance3.lm <- lm(grade ~ study + absence + knowledge, dat)
summary(variance3.lm)
# Output

Call:
lm(formula = grade ~ study + absence + knowledge, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-17.9720  -4.9393  -0.6443   6.6919  13.2094 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 55.70510    9.81742   5.674 5.12e-07 ***
study        0.09713    0.02711   3.583 0.000713 ***
absence     -0.61701    0.11516  -5.358 1.64e-06 ***
knowledge    0.39900    0.10943   3.646 0.000585 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.417 on 56 degrees of freedom
Multiple R-squared:  0.7595,    Adjusted R-squared:  0.7466 
F-statistic: 58.95 on 3 and 56 DF,  p-value: < 2.2e-16
2 variables
variance2.lm <- lm(grade ~ absence + knowledge, dat)
summary(variance2.lm)
# Output

Call:
lm(formula = grade ~ absence + knowledge, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-17.8179  -5.2150   0.2177   4.5910  18.1371 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 76.89733    8.61051   8.931 2.00e-12 ***
absence     -0.88513    0.09619  -9.201 7.25e-13 ***
knowledge    0.29979    0.11634   2.577   0.0126 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 8.151 on 57 degrees of freedom
Multiple R-squared:  0.7044,    Adjusted R-squared:  0.694 
F-statistic:  67.9 on 2 and 57 DF,  p-value: 8.246e-16
1 variable
variance1.lm <- lm(grade ~ absence, dat)
summary(variance1.lm)
# Output
Call:
lm(formula = grade ~ absence, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-17.5207  -6.5420  -0.0378   6.5188  19.6460 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 98.31673    2.35253   41.79  < 2e-16 ***
absence     -0.99020    0.09126  -10.85 1.38e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 8.538 on 58 degrees of freedom
Multiple R-squared:  0.6699,    Adjusted R-squared:  0.6642 
F-statistic: 117.7 on 1 and 58 DF,  p-value: 1.384e-15

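Adjusted R-squared was highest for the 3-variable model (0.7466, versus 0.7434 with all four predictors, 0.694 with two, and 0.6642 with one). By this criterion, the 3-variable model is the most appropriate of the four.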
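The ranking can be reproduced from the reported numbers. Adjusted R-squared is 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors, so an extra predictor only helps if it raises R² by more than the penalty. The sketch below plugs in the Multiple R-squared values and residual degrees of freedom from the four summaries; `adj_r2` is a hypothetical helper written for this example, not a base R function.

```r
# Values copied from the four summary() outputs
fits <- data.frame(
  model  = c("all 4", "3 variables", "2 variables", "1 variable"),
  r2     = c(0.7608, 0.7595, 0.7044, 0.6699),  # Multiple R-squared
  p      = c(4, 3, 2, 1),                      # number of predictors
  df.res = c(55, 56, 57, 58)                   # residual degrees of freedom
)

# Sample size implied by each fit: residual df + predictors + intercept
n <- fits$df.res + fits$p + 1

# adj_r2() is a hypothetical helper, not part of base R
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

fits$adj.r2 <- round(adj_r2(fits$r2, n, fits$p), 4)

# Models sorted from highest to lowest adjusted R-squared
fits[order(-fits$adj.r2), c("model", "adj.r2")]
```

Adding `interest` raises R² only from 0.7595 to 0.7608, which is not enough to offset the penalty for the extra predictor, so the adjusted value drops from 0.7466 to 0.7434.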

Sample answers
