Fundamentals of Data Analysis: weekly assignment A06
WAA06.1
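# Load and shape the data
# stringsAsFactors = TRUE keeps condition and medicine as factors (no longer the default since R 4.0)
dat <- read.csv("http://peach.l.chiba-u.ac.jp/course_folder/waa05.csv", stringsAsFactors = TRUE)
# Reverse the level order so that the pre-treatment level becomes the reference
dat$condition <- factor(dat$condition, levels = rev(levels(dat$condition)))
medicineA <- dat[dat$medicine == "Medicine A", ]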
# lm
medicineA.lm <- lm(blood.pressure ~ condition,data = medicineA)
summary(medicineA.lm)
# Output
Call:
lm(formula = blood.pressure ~ condition, data = medicineA)
Residuals:
   Min     1Q Median     3Q    Max
 -9.76  -3.88   0.18   4.12  11.12
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   130.8800     0.7014  186.61   <2e-16 ***
conditionpost -26.1200     0.9919  -26.33   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.959 on 98 degrees of freedom
Multiple R-squared: 0.8762, Adjusted R-squared: 0.8749
F-statistic: 693.5 on 1 and 98 DF, p-value: < 2.2e-16
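The regression shows a significant difference in blood pressure before and after administration of Medicine A: the post-administration mean is estimated to be 26.12 lower than the pre-administration mean of 130.88 (t = -26.33 on 98 degrees of freedom, p < 2e-16).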
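The way this output is read (intercept = mean of the reference level, slope = difference between the two level means) can be checked on simulated data. The following is only a minimal sketch: the data frame sim, its values, and the object names are hypothetical stand-ins, not the course data.

```r
# Simulated stand-in data; all values are hypothetical, not the course data.
set.seed(1)
sim <- data.frame(
  condition = factor(rep(c("pre", "post"), each = 50), levels = c("pre", "post")),
  blood.pressure = c(rnorm(50, mean = 130, sd = 5), rnorm(50, mean = 105, sd = 5))
)

sim.lm <- lm(blood.pressure ~ condition, data = sim)
group.means <- tapply(sim$blood.pressure, sim$condition, mean)

# Intercept = mean of the reference level ("pre");
# slope = mean("post") - mean("pre").
coef(sim.lm)
group.means

# The slope's t statistic equals the equal-variance two-sample t statistic.
# t.test() computes first level minus second (pre - post), so the sign is flipped.
tt <- t.test(blood.pressure ~ condition, data = sim, var.equal = TRUE)
lm.t <- summary(sim.lm)$coefficients["conditionpost", "t value"]
c(lm = lm.t, t.test = -unname(tt$statistic))
```

With a single two-level factor as predictor, lm() and the pooled-variance t-test give the same test, so either could be reported for this kind of comparison.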
WAA06.2
# Shape the data
post <- dat[dat$condition == "post",]
# lm
dat.lm <- lm(blood.pressure ~ medicine,data = post)
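# plot (x coded 0 = Medicine A, 1 = Medicine B; factor() makes this work for character columns too)
plot(as.numeric(factor(post$medicine)) - 1, post$blood.pressure, pch = 19,
     ylab = "blood pressure", xlab = "medicine type", xaxt = "n",
     xlim = c(-0.5, 1.5))
axis(1, c(0, 1), c("Medicine A", "Medicine B"))
abline(dat.lm, col = "red")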
summary(dat.lm)
# Output
Call:
lm(formula = blood.pressure ~ medicine, data = post)
Residuals:
    Min      1Q  Median      3Q     Max
-11.180  -4.760  -0.470   4.385  17.820
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        104.7600     0.8742  119.84   <2e-16 ***
medicineMedicine B  25.4200     1.2363   20.56   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.181 on 98 degrees of freedom
Multiple R-squared: 0.8118, Adjusted R-squared: 0.8099
F-statistic: 422.8 on 1 and 98 DF, p-value: < 2.2e-16
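The regression shows a significant difference between Medicine A and Medicine B after administration: blood pressure is estimated to be 25.42 higher with Medicine B than with Medicine A, whose post-administration mean is 104.76 (t = 20.56 on 98 degrees of freedom, p < 2e-16).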
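The plot above puts the two medicines at x = 0 and x = 1 so that the fitted line from abline() passes through the two group means. A minimal sketch of why that coding matches what lm() does internally is given below; sim and the other object names are hypothetical, and the data are simulated rather than taken from the course file.

```r
# Simulated stand-in data; all values are hypothetical.
set.seed(2)
sim <- data.frame(
  medicine = factor(rep(c("Medicine A", "Medicine B"), each = 50)),
  blood.pressure = c(rnorm(50, mean = 105, sd = 6), rnorm(50, mean = 130, sd = 6))
)
sim.lm <- lm(blood.pressure ~ medicine, data = sim)

# lm() expands the two-level factor into a 0/1 dummy column,
# which is the same coding as as.numeric(factor) - 1 used for the x axis.
x <- as.numeric(sim$medicine) - 1            # 0 = Medicine A, 1 = Medicine B
dummy <- model.matrix(sim.lm)[, "medicineMedicine B"]
all(x == dummy)

# Height of the fitted line at x = 0 and x = 1 versus the raw group means
line.at <- coef(sim.lm)[[1]] + coef(sim.lm)[[2]] * c(0, 1)
group.means <- as.numeric(tapply(sim$blood.pressure, sim$medicine, mean))
rbind(line.at, group.means)
```

Because the dummy variable only takes the values 0 and 1, the line is meaningful only at those two points; the segment between them is a visual aid.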
WAA06.3
dat <- read.table("http://www.matsuka.info/data_folder/tdkPATH01.txt", header = TRUE)
plot(dat)
All 4 variables
varianceAll.lm <- lm(grade ~ study + absence + knowledge + interest,dat)
summary(varianceAll.lm)
# Output
Call:
lm(formula = grade ~ study + absence + knowledge + interest,
data = dat)
Residuals:
    Min      1Q  Median      3Q     Max
-17.900  -5.146  -0.587   6.524  13.202
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.94744    9.97293   5.510 9.83e-07 ***
study        0.08355    0.03667   2.279  0.02660 *
absence     -0.58676    0.12810  -4.580 2.70e-05 ***
knowledge    0.36450    0.12650   2.882  0.00563 **
interest     0.86450    1.56025   0.554  0.58177
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.463 on 55 degrees of freedom
Multiple R-squared: 0.7608, Adjusted R-squared: 0.7434
F-statistic: 43.74 on 4 and 55 DF, p-value: < 2.2e-16
3 variables
variance3.lm <- lm(grade ~ study + absence + knowledge,dat)
summary(variance3.lm)
# Output
Call:
lm(formula = grade ~ study + absence + knowledge, data = dat)
Residuals:
     Min       1Q   Median       3Q      Max
-17.9720  -4.9393  -0.6443   6.6919  13.2094
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 55.70510    9.81742   5.674 5.12e-07 ***
study        0.09713    0.02711   3.583 0.000713 ***
absence     -0.61701    0.11516  -5.358 1.64e-06 ***
knowledge    0.39900    0.10943   3.646 0.000585 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.417 on 56 degrees of freedom
Multiple R-squared: 0.7595, Adjusted R-squared: 0.7466
F-statistic: 58.95 on 3 and 56 DF, p-value: < 2.2e-16
2 variables
variance2.lm <- lm(grade ~ absence + knowledge,dat)
summary(variance2.lm)
# Output
Call:
lm(formula = grade ~ absence + knowledge, data = dat)
Residuals:
     Min       1Q   Median       3Q      Max
-17.8179  -5.2150   0.2177   4.5910  18.1371
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 76.89733    8.61051   8.931 2.00e-12 ***
absence     -0.88513    0.09619  -9.201 7.25e-13 ***
knowledge    0.29979    0.11634   2.577   0.0126 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.151 on 57 degrees of freedom
Multiple R-squared: 0.7044, Adjusted R-squared: 0.694
F-statistic: 67.9 on 2 and 57 DF, p-value: 8.246e-16
1 variable
variance1.lm <- lm(grade ~ absence, dat)
summary(variance1.lm)
# Output
Call:
lm(formula = grade ~ absence, data = dat)
Residuals:
     Min       1Q   Median       3Q      Max
-17.5207  -6.5420  -0.0378   6.5188  19.6460
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 98.31673    2.35253   41.79  < 2e-16 ***
absence     -0.99020    0.09126  -10.85 1.38e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.538 on 58 degrees of freedom
Multiple R-squared: 0.6699, Adjusted R-squared: 0.6642
F-statistic: 117.7 on 1 and 58 DF, p-value: 1.384e-15
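Adjusted R-squared was highest for the 3-variable model (0.7466, versus 0.7434 with all four variables, 0.694 with two, and 0.6642 with one). The 3-variable model is therefore the most appropriate of the four.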
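As a cross-check on this comparison, adjusted R-squared can be recomputed from the reported Multiple R-squared values with the usual formula 1 - (1 - R^2)(n - 1)/(n - p - 1), where p is the number of predictors. The sketch below assumes n = 60, which is inferred from the residual degrees of freedom in the outputs (55 + 4 predictors + 1 intercept) rather than stated in the assignment; the object names are hypothetical.

```r
# Multiple R-squared values copied from the four summary() outputs
r2 <- c(all4 = 0.7608, three = 0.7595, two = 0.7044, one = 0.6699)
p  <- c(all4 = 4, three = 3, two = 2, one = 1)   # number of predictors

# Assumption: n = 60, inferred from 55 residual df + 4 slopes + 1 intercept
n <- 60

# Adjusted R-squared penalises each additional predictor
adj.r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj.r2, 4)

# Name of the model with the highest adjusted R-squared
names(which.max(adj.r2))
```

The recomputed values agree with the reported Adjusted R-squared figures up to rounding. Dropping interest raises adjusted R-squared slightly even though Multiple R-squared falls, because the small loss in fit is outweighed by the penalty removed for the extra predictor.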