
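[R] Running many regression formulas at once with map(), and creating the columns that store the results (partial regression coefficients, t-values, p-values) with mutate(), in a single %>% pipeline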

Posted at 2020-12-31

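Following on from __my previous article__, I wrote R code that does the following in a single chain using the %>% pipe notation.

  • Build multiple combinations of regression formula left-hand and right-hand sides by vectorized string concatenation
  • Apply the many regression formulas to the data in one go with the __*map()* function__
  • Define the columns that store the results of fitting the multiple regression models with the __*mutate()* function__

I ran the code from __安井 翔太 (Shota Yasui), 『効果検証入門』 (技術評論社), pp. 71-72__ with the variable names changed.

The dataset used here was obtained from the following GitHub repository.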

RStudio
> library(remotes)
> remotes::install_github("itamarcaspi/experimentdatar")
Skipping install of 'experimentdatar' from a github remote, the SHA1 (f71a9d07) has not changed since last install.
  Use `force = TRUE` to force installation
> 
> library(experimentdatar)
> library(broom)
> library(tidyverse)
 Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 
 ggplot2 3.3.3      purrr   0.3.4
 tibble  3.0.4      dplyr   1.0.2
 tidyr   1.1.2      stringr 1.4.0
 readr   1.4.0      forcats 0.5.0
 Conflicts ──────────────────────────────────────── tidyverse_conflicts() 
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
>
RStudio
> data(vouchers) 
> 
> summary(vouchers)
       ID            BOG95SMP          BOG97SMP          JAM93SMP             SEX             AGE       
 Min.   :     1   Min.   :0.00000   Min.   :0.00000   Min.   :0.000000   Min.   :0.000   Min.   :10.0   
 1st Qu.:  6333   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.000   1st Qu.:14.0   
 Median : 12665   Median :0.00000   Median :0.00000   Median :0.000000   Median :1.000   Median :15.0   
 Mean   : 27670   Mean   :0.04643   Mean   :0.01094   Mean   :0.006514   Mean   :0.526   Mean   :14.7   
 3rd Qu.: 18997   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:1.000   3rd Qu.:16.0   
 Max.   :141226   Max.   :1.00000   Max.   :1.00000   Max.   :1.000000   Max.   :1.000   Max.   :20.0   
 NA's   :1                                                               NA's   :5403    NA's   :23392  
      AGE2          HSVISIT         SCYFNSH           INSCHL         PRSCH_C         PRSCHA_1        PRSCHA_2    
 Min.   :-2.00   Min.   :0.000   Min.   : 4.000   Min.   :0.000   Min.   :0.000   Min.   :0.00    Min.   :0.000  
 1st Qu.:12.00   1st Qu.:0.000   1st Qu.: 5.000   1st Qu.:1.000   1st Qu.:0.000   1st Qu.:1.00    1st Qu.:1.000  
 Median :13.00   Median :0.000   Median : 5.000   Median :1.000   Median :1.000   Median :1.00    Median :1.000  
 Mean   :13.14   Mean   :0.184   Mean   : 5.173   Mean   :0.861   Mean   :0.658   Mean   :0.89    Mean   :0.756  
 3rd Qu.:14.00   3rd Qu.:0.000   3rd Qu.: 5.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.00    3rd Qu.:1.000  
 Max.   :84.00   Max.   :1.000   Max.   :11.000   Max.   :1.000   Max.   :1.000   Max.   :1.00    Max.   :1.000  
 NA's   :5519    NA's   :23384                    NA's   :23399   NA's   :23407   NA's   :23409   NA's   :23409  
     VOUCH0          BOG95ASD          BOG97ASD          JAM93ASD          DBOGOTA          DJAMUNDI      
 Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000   Min.   :0.0000   Min.   :0.00000  
 1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000  
 Median :1.0000   Median :0.00000   Median :0.00000   Median :0.00000   Median :0.0000   Median :0.00000  
 Mean   :0.7178   Mean   :0.08879   Mean   :0.01804   Mean   :0.01101   Mean   :0.2296   Mean   :0.01354  
 3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000  
 Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000   Max.   :1.0000   Max.   :1.00000  
 NA's   :1                                                                                                
     D1995            D1997            RESPONSE        TEST_TAK        SEX_NAME          SVY            D1993        
 Min.   :0.0000   Min.   :0.00000   Min.   :0.000   Min.   :0.000   Min.   :0.00    Min.   :0.00    Min.   :0.00000  
 1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.00    1st Qu.:0.00    1st Qu.:0.00000  
 Median :0.0000   Median :0.00000   Median :1.000   Median :0.000   Median :0.00    Median :1.00    Median :0.00000  
 Mean   :0.1795   Mean   :0.06988   Mean   :0.541   Mean   :0.046   Mean   :0.49    Mean   :0.51    Mean   :0.04828  
 3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:1.000   3rd Qu.:0.000   3rd Qu.:1.00    3rd Qu.:1.00    3rd Qu.:0.00000  
 Max.   :1.0000   Max.   :1.00000   Max.   :1.000   Max.   :1.000   Max.   :1.00    Max.   :1.00    Max.   :1.00000  
                                    NA's   :22258   NA's   :19173   NA's   :20487   NA's   :23384                    
     PHONE            DAREA1              DAREA2              DAREA3             DAREA4             DAREA5       
 Min.   :0.0000   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.00e+00   Min.   :0.000000   Min.   :0.00000  
 1st Qu.:0.0000   1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.00e+00   1st Qu.:0.000000   1st Qu.:0.00000  
 Median :1.0000   Median :0.0000000   Median :0.0000000   Median :0.00e+00   Median :0.000000   Median :0.00000  
 Mean   :0.5659   Mean   :0.0007501   Mean   :0.0001579   Mean   :3.95e-05   Mean   :0.005448   Mean   :0.01958  
 3rd Qu.:1.0000   3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.00e+00   3rd Qu.:0.000000   3rd Qu.:0.00000  
 Max.   :1.0000   Max.   :1.0000000   Max.   :1.0000000   Max.   :1.00e+00   Max.   :1.000000   Max.   :1.00000  
                                                                                                                 
     DAREA6             DAREA7            DAREA8            DAREA9            DAREA10            DAREA11        
 Min.   :0.000000   Min.   :0.00000   Min.   :0.00000   Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
 1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.000000  
 Median :0.000000   Median :0.00000   Median :0.00000   Median :0.000000   Median :0.000000   Median :0.000000  
 Mean   :0.006317   Mean   :0.01176   Mean   :0.00379   Mean   :0.001027   Mean   :0.006948   Mean   :0.008685  
 3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.:0.000000  
 Max.   :1.000000   Max.   :1.00000   Max.   :1.00000   Max.   :1.000000   Max.   :1.000000   Max.   :1.000000  
                                                                                                                
    DAREA12            DAREA13             DAREA14             DAREA15            DAREA16            DAREA17       
 Min.   :0.000000   Min.   :0.0000000   Min.   :0.0000000   Min.   :0.000000   Min.   :0.000000   Min.   :0.00000  
 1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.00000  
 Median :0.000000   Median :0.0000000   Median :0.0000000   Median :0.000000   Median :0.000000   Median :0.00000  
 Mean   :0.002842   Mean   :0.0001184   Mean   :0.0002764   Mean   :0.003277   Mean   :0.001895   Mean   :0.01208  
 3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.:0.00000  
 Max.   :1.000000   Max.   :1.0000000   Max.   :1.0000000   Max.   :1.000000   Max.   :1.000000   Max.   :1.00000  
                                                                                                                   
    DAREA18            DAREA19           DMONTH1            DMONTH2            DMONTH3           DMONTH4       
 Min.   :0.000000   Min.   :0.00000   Min.   :0.000000   Min.   :0.000000   Min.   :0.00000   Min.   :0.00000  
 1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.00000  
 Median :0.000000   Median :0.00000   Median :0.000000   Median :0.000000   Median :0.00000   Median :0.00000  
 Mean   :0.003632   Mean   :0.03095   Mean   :0.001027   Mean   :0.001658   Mean   :0.02084   Mean   :0.01311  
 3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.00000  
 Max.   :1.000000   Max.   :1.00000   Max.   :1.000000   Max.   :1.000000   Max.   :1.00000   Max.   :1.00000  
                                                                                                               
    DMONTH5            DMONTH6            DMONTH7           DMONTH8            DMONTH9             DMONTH10        
 Min.   :0.000000   Min.   :0.000000   Min.   :0.00000   Min.   :0.000000   Min.   :0.0000000   Min.   :0.0000000  
 1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0.0000000  
 Median :0.000000   Median :0.000000   Median :0.00000   Median :0.000000   Median :0.0000000   Median :0.0000000  
 Mean   :0.008685   Mean   :0.006514   Mean   :0.01686   Mean   :0.006317   Mean   :0.0006317   Mean   :0.0003553  
 3rd Qu.:0.000000   3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000  
 Max.   :1.000000   Max.   :1.000000   Max.   :1.00000   Max.   :1.000000   Max.   :1.0000000   Max.   :1.0000000  
                                                                                                                   
    DMONTH11            DMONTH12             BOG95            BOG97            MOM_SCH          MOM_AGE     
 Min.   :0.0000000   Min.   :0.0000000   Min.   :0.0000   Min.   :0.00000   Min.   : 0.000   Min.   : 8.00  
 1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.: 5.000   1st Qu.:35.00  
 Median :0.0000000   Median :0.0000000   Median :0.0000   Median :0.00000   Median : 5.000   Median :39.00  
 Mean   :0.0004343   Mean   :0.0003948   Mean   :0.1597   Mean   :0.06988   Mean   : 5.894   Mean   :40.41  
 3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.: 8.000   3rd Qu.:45.00  
 Max.   :1.0000000   Max.   :1.0000000   Max.   :1.0000   Max.   :1.00000   Max.   :11.000   Max.   :97.00  
                                                                            NA's   :23607    NA's   :23517  
     MOM_MW         DAD_SCH          DAD_AGE          DAD_MW           SEX2          STRATA1           STRATA2       
 Min.   :0.00    Min.   : 0.000   Min.   : 1.00   Min.   :0.000   Min.   :0.000   Min.   :0.00000   Min.   :0.00000  
 1st Qu.:0.00    1st Qu.: 4.000   1st Qu.:38.00   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.00000   1st Qu.:0.00000  
 Median :0.00    Median : 5.000   Median :43.00   Median :0.000   Median :0.000   Median :0.00000   Median :0.00000  
 Mean   :0.02    Mean   : 5.856   Mean   :44.25   Mean   :0.098   Mean   :0.486   Mean   :0.01176   Mean   :0.04177  
 3rd Qu.:0.00    3rd Qu.: 8.000   3rd Qu.:49.00   3rd Qu.:0.000   3rd Qu.:1.000   3rd Qu.:0.00000   3rd Qu.:0.00000  
 Max.   :1.00    Max.   :11.000   Max.   :91.00   Max.   :1.000   Max.   :1.000   Max.   :1.00000   Max.   :1.00000  
 NA's   :23546   NA's   :23946    NA's   :23806   NA's   :23933   NA's   :20180                                      
    STRATA3           STRATA4             STRATA5           STRATA6            STRATAMS          REPT6      
 Min.   :0.00000   Min.   :0.0000000   Min.   :0.0e+00   Min.   :0.00e+00   Min.   :0.0000   Min.   :0.000  
 1st Qu.:0.00000   1st Qu.:0.0000000   1st Qu.:0.0e+00   1st Qu.:0.00e+00   1st Qu.:1.0000   1st Qu.:0.000  
 Median :0.00000   Median :0.0000000   Median :0.0e+00   Median :0.00e+00   Median :1.0000   Median :0.000  
 Mean   :0.01046   Mean   :0.0004343   Mean   :7.9e-05   Mean   :3.95e-05   Mean   :0.9355   Mean   :0.139  
 3rd Qu.:0.00000   3rd Qu.:0.0000000   3rd Qu.:0.0e+00   3rd Qu.:0.00e+00   3rd Qu.:1.0000   3rd Qu.:0.000  
 Max.   :1.00000   Max.   :1.0000000   Max.   :1.0e+00   Max.   :1.00e+00   Max.   :1.0000   Max.   :3.000  
                                                                                             NA's   :23411  
    TOTSCYRS        HASCHILD        MARRIED         WORKING           REPT           NREPT          FINISH6     
 Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
 1st Qu.:2.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000  
 Median :4.000   Median :0.000   Median :0.000   Median :0.000   Median :0.000   Median :0.000   Median :1.000  
 Mean   :3.368   Mean   :0.025   Mean   :0.012   Mean   :0.149   Mean   :0.164   Mean   :0.183   Mean   :0.936  
 3rd Qu.:4.000   3rd Qu.:0.000   3rd Qu.:0.000   3rd Qu.:0.000   3rd Qu.:0.000   3rd Qu.:0.000   3rd Qu.:1.000  
 Max.   :6.000   Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :3.000   Max.   :1.000  
 NA's   :23395   NA's   :23400   NA's   :23399   NA's   :23395   NA's   :23395   NA's   :23395   NA's   :23395  
    FINISH7         FINISH8         SEX_MISS        USNGSCH         HOURSUM          TAB3SMPL        WORKING3    
 Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   : 0.000   Min.   :1       Min.   :0.000  
 1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.: 0.000   1st Qu.:1       1st Qu.:0.000  
 Median :1.000   Median :1.000   Median :0.000   Median :0.000   Median : 0.000   Median :1       Median :0.000  
 Mean   :0.661   Mean   :0.522   Mean   :0.001   Mean   :0.351   Mean   : 3.626   Mean   :1       Mean   :0.134  
 3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:0.000   3rd Qu.:1.000   3rd Qu.: 0.000   3rd Qu.:1       3rd Qu.:0.000  
 Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :1.000   Max.   :40.000   Max.   :1       Max.   :1.000  
 NA's   :23395   NA's   :23395   NA's   :23395   NA's   :23395   NA's   :23395    NA's   :23753   NA's   :23395  
> 
RStudio
> head(vouchers)
# A tibble: 6 x 89
     ID BOG95SMP BOG97SMP JAM93SMP   SEX   AGE  AGE2 HSVISIT SCYFNSH INSCHL PRSCH_C PRSCHA_1 PRSCHA_2 VOUCH0 BOG95ASD
  <dbl>    <dbl>    <dbl>    <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl>   <dbl>    <dbl>    <dbl>  <dbl>    <dbl>
1    NA        0        0        0    NA    NA    NA      NA       5     NA      NA       NA       NA     NA        0
2     1        0        0        0     1    NA    12      NA       5     NA      NA       NA       NA      0        1
3     2        0        0        0     0    NA    13      NA       5     NA      NA       NA       NA      0        1
4     3        1        0        0     0    14    12       0       8      1       1        1        1      1        1
5     4        1        0        0     1    14    12       0       8      1       1        1        1      0        1
6     5        1        0        0     0    14    12       0       8      1       0        1        0      0        1
# … with 74 more variables: BOG97ASD <dbl>, JAM93ASD <dbl>, DBOGOTA <dbl>, DJAMUNDI <dbl>, D1995 <dbl>, D1997 <dbl>,
#   RESPONSE <dbl>, TEST_TAK <dbl>, SEX_NAME <dbl>, SVY <dbl>, D1993 <dbl>, PHONE <dbl>, DAREA1 <dbl>, DAREA2 <dbl>,
#   DAREA3 <dbl>, DAREA4 <dbl>, DAREA5 <dbl>, DAREA6 <dbl>, DAREA7 <dbl>, DAREA8 <dbl>, DAREA9 <dbl>, DAREA10 <dbl>,
#   DAREA11 <dbl>, DAREA12 <dbl>, DAREA13 <dbl>, DAREA14 <dbl>, DAREA15 <dbl>, DAREA16 <dbl>, DAREA17 <dbl>,
#   DAREA18 <dbl>, DAREA19 <dbl>, DMONTH1 <dbl>, DMONTH2 <dbl>, DMONTH3 <dbl>, DMONTH4 <dbl>, DMONTH5 <dbl>,
#   DMONTH6 <dbl>, DMONTH7 <dbl>, DMONTH8 <dbl>, DMONTH9 <dbl>, DMONTH10 <dbl>, DMONTH11 <dbl>, DMONTH12 <dbl>,
#   BOG95 <dbl>, BOG97 <dbl>, MOM_SCH <dbl>, MOM_AGE <dbl>, MOM_MW <dbl>, DAD_SCH <dbl>, DAD_AGE <dbl>, DAD_MW <dbl>,
#   SEX2 <dbl>, STRATA1 <dbl>, STRATA2 <dbl>, STRATA3 <dbl>, STRATA4 <dbl>, STRATA5 <dbl>, STRATA6 <dbl>,
#   STRATAMS <dbl>, REPT6 <dbl>, TOTSCYRS <dbl>, HASCHILD <dbl>, MARRIED <dbl>, WORKING <dbl>, REPT <dbl>,
#   NREPT <dbl>, FINISH6 <dbl>, FINISH7 <dbl>, FINISH8 <dbl>, SEX_MISS <dbl>, USNGSCH <dbl>, HOURSUM <dbl>,
#   TAB3SMPL <dbl>, WORKING3 <dbl>
> 

####Right-hand side of the multiple regression (a string literal that adds several independent variables linearly)

RStudio
> lm_independent_values_expression <- "SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3"

####Define the most basic (base) variable among the independent variables

RStudio
> formula_x_base = "VOUCH0"

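####Define the list of dependent (outcome) variables. Each value is taken out one at a time and placed on the left-hand side of the regression formulas defined below.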

RStudio
> dependent_value_list <- c("TOTSCYRS", "INSCHL", "PRSCH_C", "USNGSCH", "PRSCHA_1", "FINISH6")
> 
> print(dependent_value_list)
[1] "TOTSCYRS" "INSCHL"   "PRSCH_C"  "USNGSCH"  "PRSCHA_1" "FINISH6" 
> 

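####Define the simple regression formulas

On the left-hand side (left of "~"), one value at a time from the dependent (outcome) variable list is placed. The right-hand side is the base variable VOUCH0.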

RStudio
> reg_formula_list <- paste(dependent_value_list,  "~",  formula_x_base)
> print(reg_formula_list)
[1] "TOTSCYRS ~ VOUCH0" "INSCHL ~ VOUCH0"   "PRSCH_C ~ VOUCH0"  "USNGSCH ~ VOUCH0"  "PRSCHA_1 ~ VOUCH0"
[6] "FINISH6 ~ VOUCH0" 
> 
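What makes this step work is that paste() is vectorized: combining a character vector with length-1 strings yields one formula string per element. Here is a minimal base-R sketch of that behaviour. The names y1, y2, y3 and x are hypothetical and are not columns of the vouchers data.

```r
# Hypothetical outcome and predictor names, for illustration only
outcomes <- c("y1", "y2", "y3")
base_term <- "x"

# paste() recycles the length-1 arguments across the outcome vector,
# so the result has one formula string per outcome
formula_strings <- paste(outcomes, "~", base_term)
print(formula_strings)

# Each string can be converted to a formula object explicitly if needed
formula_objects <- lapply(formula_strings, as.formula)
print(class(formula_objects[[1]]))
```

In this article the strings are handed to lm() as they are, so the explicit as.formula() conversion is optional here.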

####Name the simple regression formulas.

RStudio
> names(reg_formula_list) <- paste(dependent_value_list, "base", sep="_")
> print(reg_formula_list)
      TOTSCYRS_base         INSCHL_base        PRSCH_C_base        USNGSCH_base       PRSCHA_1_base 
"TOTSCYRS ~ VOUCH0"   "INSCHL ~ VOUCH0"  "PRSCH_C ~ VOUCH0"  "USNGSCH ~ VOUCH0" "PRSCHA_1 ~ VOUCH0" 
       FINISH6_base 
 "FINISH6 ~ VOUCH0" 
>

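####Define the multiple regression formulas

As before, one value at a time from the dependent (outcome) variable list is placed on the left-hand side (left of "~"). The right-hand side is the base variable followed by the other independent variables.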

RStudio
> all_reg_formula_list <- paste(dependent_value_list, "~", formula_x_base, "+", lm_independent_values_expression)
>
> print(all_reg_formula_list)
[1] "TOTSCYRS ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3"
[2] "INSCHL ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3"  
[3] "PRSCH_C ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
[4] "USNGSCH ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
[5] "PRSCHA_1 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3"
[6] "FINISH6 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
> 

####Combine the simple and multiple regression formulas (the string literals written above).

Only the simple regression formulas have names.

RStudio
> table_formula <- c(reg_formula_list, all_reg_formula_list)
> print(table_formula)
                                                          TOTSCYRS_base 
                                                    "TOTSCYRS ~ VOUCH0" 
                                                            INSCHL_base 
                                                      "INSCHL ~ VOUCH0" 
                                                           PRSCH_C_base 
                                                     "PRSCH_C ~ VOUCH0" 
                                                           USNGSCH_base 
                                                     "USNGSCH ~ VOUCH0" 
                                                          PRSCHA_1_base 
                                                    "PRSCHA_1 ~ VOUCH0" 
                                                           FINISH6_base 
                                                     "FINISH6 ~ VOUCH0" 
                                                                        
"TOTSCYRS ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
                                                                        
  "INSCHL ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
                                                                        
 "PRSCH_C ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
                                                                        
 "USNGSCH ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
                                                                        
"PRSCHA_1 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
                                                                        
 "FINISH6 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3" 
> 

####Convert to a data frame (tibble)

RStudio
> models <- table_formula %>% enframe(name = "model_index", value = "formula")
> print(models)
# A tibble: 12 x 2
   model_index     formula                                                              
   <chr>           <chr>                                                                
 1 "TOTSCYRS_base" TOTSCYRS ~ VOUCH0                                                    
 2 "INSCHL_base"   INSCHL ~ VOUCH0                                                      
 3 "PRSCH_C_base"  PRSCH_C ~ VOUCH0                                                     
 4 "USNGSCH_base"  USNGSCH ~ VOUCH0                                                     
 5 "PRSCHA_1_base" PRSCHA_1 ~ VOUCH0                                                    
 6 "FINISH6_base"  FINISH6 ~ VOUCH0                                                     
 7 ""              TOTSCYRS ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3
 8 ""              INSCHL ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3  
 9 ""              PRSCH_C ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3 
10 ""              USNGSCH ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3 
11 ""              PRSCHA_1 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3
12 ""              FINISH6 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3 
> 
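In the output above, the rows for the multiple regression formulas have an empty model_index because only the simple regression formulas were named. One possible way to fill it in is to name the second vector as well before combining. The following is a sketch under assumptions: the "_covariate" suffix is a hypothetical choice made here, and the formulas are shortened to two outcomes for brevity.

```r
library(tibble)

outcomes <- c("TOTSCYRS", "INSCHL")
simple_formulas <- paste(outcomes, "~ VOUCH0")
full_formulas <- paste(outcomes, "~ VOUCH0 + SVY + AGE")

# Name both vectors; "_covariate" is a hypothetical suffix for this sketch
names(simple_formulas) <- paste(outcomes, "base", sep = "_")
names(full_formulas) <- paste(outcomes, "covariate", sep = "_")

# enframe() turns the names into one column and the values into another
models_named <- enframe(c(simple_formulas, full_formulas),
                        name = "model_index", value = "formula")
print(models_named)
```

With every row labelled, a given model can later be picked out by its model_index rather than by its position.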

####Define the data to which the regression formulas are applied

RStudio
> regression_data <- vouchers %>% filter(TAB3SMPL == 1, BOG95SMP == 1)
> head(regression_data)
# A tibble: 6 x 89
     ID BOG95SMP BOG97SMP JAM93SMP   SEX   AGE  AGE2 HSVISIT SCYFNSH INSCHL PRSCH_C PRSCHA_1 PRSCHA_2 VOUCH0 BOG95ASD
  <dbl>    <dbl>    <dbl>    <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl>   <dbl>    <dbl>    <dbl>  <dbl>    <dbl>
1     3        1        0        0     0    14    12       0       8      1       1        1        1      1        1
2     4        1        0        0     1    14    12       0       8      1       1        1        1      0        1
3     5        1        0        0     0    14    12       0       8      1       0        1        0      0        1
4     6        1        0        0     0    12    10       0       7      1       0        1        1      0        1
5    10        1        0        0     1    14    11       0       8      1       0        0        0      1        1
6    11        1        0        0     0    14    12       0       8      1       0        0        0      1        1
# … with 74 more variables: BOG97ASD <dbl>, JAM93ASD <dbl>, DBOGOTA <dbl>, DJAMUNDI <dbl>, D1995 <dbl>, D1997 <dbl>,
#   RESPONSE <dbl>, TEST_TAK <dbl>, SEX_NAME <dbl>, SVY <dbl>, D1993 <dbl>, PHONE <dbl>, DAREA1 <dbl>, DAREA2 <dbl>,
#   DAREA3 <dbl>, DAREA4 <dbl>, DAREA5 <dbl>, DAREA6 <dbl>, DAREA7 <dbl>, DAREA8 <dbl>, DAREA9 <dbl>, DAREA10 <dbl>,
#   DAREA11 <dbl>, DAREA12 <dbl>, DAREA13 <dbl>, DAREA14 <dbl>, DAREA15 <dbl>, DAREA16 <dbl>, DAREA17 <dbl>,
#   DAREA18 <dbl>, DAREA19 <dbl>, DMONTH1 <dbl>, DMONTH2 <dbl>, DMONTH3 <dbl>, DMONTH4 <dbl>, DMONTH5 <dbl>,
#   DMONTH6 <dbl>, DMONTH7 <dbl>, DMONTH8 <dbl>, DMONTH9 <dbl>, DMONTH10 <dbl>, DMONTH11 <dbl>, DMONTH12 <dbl>,
#   BOG95 <dbl>, BOG97 <dbl>, MOM_SCH <dbl>, MOM_AGE <dbl>, MOM_MW <dbl>, DAD_SCH <dbl>, DAD_AGE <dbl>, DAD_MW <dbl>,
#   SEX2 <dbl>, STRATA1 <dbl>, STRATA2 <dbl>, STRATA3 <dbl>, STRATA4 <dbl>, STRATA5 <dbl>, STRATA6 <dbl>,
#   STRATAMS <dbl>, REPT6 <dbl>, TOTSCYRS <dbl>, HASCHILD <dbl>, MARRIED <dbl>, WORKING <dbl>, REPT <dbl>,
#   NREPT <dbl>, FINISH6 <dbl>, FINISH7 <dbl>, FINISH8 <dbl>, SEX_MISS <dbl>, USNGSCH <dbl>, HOURSUM <dbl>,
#   TAB3SMPL <dbl>, WORKING3 <dbl>
> 

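####Apply the multiple regression formulas to the dataset in one go with the __*map()* function__

-Values to iterate over (first argument of map): the multiple regression formulas stored in models$formula
-Function to apply (second argument of map): the __*lm()* function__. lm() receives the regression formulas from models$formula one at a time.
-Data to fit on (third argument of map, passed through to lm() as data): regression_data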

RStudio
> df_models <- models %>% mutate(model = map(.x = formula, .f = lm, data = regression_data)) %>% mutate(lm_result = map(.x = model, .f = tidy))
> 
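The same pattern can be tried without installing experimentdatar. The sketch below uses the built-in mtcars data and two hypothetical formulas, and spells the lm() call out in an anonymous function instead of forwarding data through map(). This is an alternative spelling of the step above, not the code used in this article.

```r
library(dplyr)
library(purrr)
library(tibble)
library(broom)

# Two hypothetical formulas on the built-in mtcars data
formulas <- c(mpg_base = "mpg ~ wt", mpg_covariate = "mpg ~ wt + hp")

fitted <- formulas %>%
  enframe(name = "model_index", value = "formula") %>%
  # Fit one lm per formula string; the result is a list column of lm objects
  mutate(model = map(formula, function(f) lm(as.formula(f), data = mtcars))) %>%
  # tidy() turns each lm object into a tibble of coefficient statistics
  mutate(lm_result = map(model, tidy))

print(fitted)
```

Each row then carries its formula, the fitted model, and a nested tibble with the term, estimate, std.error, statistic and p.value columns.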

####Display the results of running the __*map()* function__.

RStudio
> print(df_models)
# A tibble: 12 x 4
   model_index     formula                                                               model  lm_result       
   <chr>           <chr>                                                                 <list> <list>          
 1 "TOTSCYRS_base" TOTSCYRS ~ VOUCH0                                                     <lm>   <tibble [2 × 5]>
 2 "INSCHL_base"   INSCHL ~ VOUCH0                                                       <lm>   <tibble [2 × 5]>
 3 "PRSCH_C_base"  PRSCH_C ~ VOUCH0                                                      <lm>   <tibble [2 × 5]>
 4 "USNGSCH_base"  USNGSCH ~ VOUCH0                                                      <lm>   <tibble [2 × 5]>
 5 "PRSCHA_1_base" PRSCHA_1 ~ VOUCH0                                                     <lm>   <tibble [2 × 5]>
 6 "FINISH6_base"  FINISH6 ~ VOUCH0                                                      <lm>   <tibble [2 × 5]>
 7 ""              TOTSCYRS ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3 <lm>   <tibble [8 × 5]>
 8 ""              INSCHL ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3   <lm>   <tibble [8 × 5]>
 9 ""              PRSCH_C ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3  <lm>   <tibble [8 × 5]>
10 ""              USNGSCH ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3  <lm>   <tibble [8 × 5]>
11 ""              PRSCHA_1 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3 <lm>   <tibble [8 × 5]>
12 ""              FINISH6 ~ VOUCH0 + SVY + HSVISIT + AGE + STRATA1 + STRATA2 + STRATA3  <lm>   <tibble [8 × 5]>
> 
RStudio
> print(df_models$model)
[[1]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
    3.65302      0.05809  


[[2]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
    0.83096      0.01861  


[[3]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
     0.5391       0.1600  


[[4]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
    0.05694      0.50887  


[[5]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
    0.87722      0.06295  


[[6]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0  
    0.94306      0.02617  


[[7]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
    6.22680      0.04594      0.05080      0.05236     -0.17671     -0.07833      0.10382      0.14800  


[[8]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
   2.262204     0.011583    -0.007225     0.051464    -0.098327    -0.003393     0.065811     0.062946  


[[9]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
    1.70532      0.15482      0.01149      0.03878     -0.08108     -0.06651      0.08655      0.06113  


[[10]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
    0.74951      0.50564     -0.04750      0.02388     -0.05027      0.06116      0.10749      0.01893  


[[11]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
    1.17091      0.06190     -0.01092      0.05269     -0.01795     -0.02522     -0.02327     -0.05940  


[[12]]

Call:
.f(formula = .x[[i]], data = ..1)

Coefficients:
(Intercept)       VOUCH0          SVY      HSVISIT          AGE      STRATA1      STRATA2      STRATA3  
  1.2825853    0.0246813    0.0062274    0.0220922   -0.0223925   -0.0293528   -0.0026688   -0.0001548  


> 
RStudio
> print(df_models$lm_result)
[[1]]
# A tibble: 2 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)   3.65      0.0374     97.7    0    
2 VOUCH0        0.0581    0.0524      1.11   0.267

[[2]]
# A tibble: 2 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)   0.831     0.0155    53.8   1.64e-315
2 VOUCH0        0.0186    0.0216     0.860 3.90e-  1

[[3]]
# A tibble: 2 x 5
  term        estimate std.error statistic   p.value
  <chr>          <dbl>     <dbl>     <dbl>     <dbl>
1 (Intercept)    0.539    0.0202     26.7  2.21e-122
2 VOUCH0         0.160    0.0283      5.66 1.96e-  8

[[4]]
# A tibble: 2 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   0.0569    0.0164      3.46 5.52e- 4
2 VOUCH0        0.509     0.0230     22.1  1.80e-90

[[5]]
# A tibble: 2 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   0.877     0.0120     72.8  0       
2 VOUCH0        0.0629    0.0169      3.73 0.000200

[[6]]
# A tibble: 2 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)   0.943    0.00860    110.    0     
2 VOUCH0        0.0262   0.0120       2.17  0.0300

[[7]]
# A tibble: 8 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   6.23      0.298     20.9   2.12e-82
2 VOUCH0        0.0459    0.0504     0.912 3.62e- 1
3 SVY           0.0508    0.0607     0.837 4.03e- 1
4 HSVISIT       0.0524    0.111      0.470 6.38e- 1
5 AGE          -0.177     0.0191    -9.26  9.79e-20
6 STRATA1      -0.0783    0.0911    -0.860 3.90e- 1
7 STRATA2       0.104     0.0715     1.45  1.46e- 1
8 STRATA3       0.148     0.0941     1.57  1.16e- 1

[[8]]
# A tibble: 8 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  2.26      0.119     19.0    2.27e-70
2 VOUCH0       0.0116    0.0201     0.576  5.64e- 1
3 SVY         -0.00722   0.0242    -0.298  7.66e- 1
4 HSVISIT      0.0515    0.0444     1.16   2.47e- 1
5 AGE         -0.0983    0.00761  -12.9    1.03e-35
6 STRATA1     -0.00339   0.0363    -0.0934 9.26e- 1
7 STRATA2      0.0658    0.0285     2.31   2.11e- 2
8 STRATA3      0.0629    0.0376     1.68   9.40e- 2

[[9]]
# A tibble: 8 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   1.71      0.162     10.5   8.49e-25
2 VOUCH0        0.155     0.0274     5.65  2.00e- 8
3 SVY           0.0115    0.0330     0.348 7.28e- 1
4 HSVISIT       0.0388    0.0605     0.641 5.22e- 1
5 AGE          -0.0811    0.0104    -7.81  1.25e-14
6 STRATA1      -0.0665    0.0495    -1.34  1.79e- 1
7 STRATA2       0.0866    0.0389     2.23  2.61e- 2
8 STRATA3       0.0611    0.0512     1.19  2.33e- 1

[[10]]
# A tibble: 8 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   0.750    0.133       5.63  2.30e- 8
2 VOUCH0        0.506    0.0225     22.5   1.03e-92
3 SVY          -0.0475   0.0271     -1.75  8.05e- 2
4 HSVISIT       0.0239   0.0498      0.480 6.32e- 1
5 AGE          -0.0503   0.00853    -5.89  5.04e- 9
6 STRATA1       0.0612   0.0407      1.50  1.33e- 1
7 STRATA2       0.107    0.0319      3.36  7.93e- 4
8 STRATA3       0.0189   0.0421      0.450 6.53e- 1

[[11]]
# A tibble: 8 x 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)   1.17     0.0996     11.8   3.21e-30
2 VOUCH0        0.0619   0.0168      3.68  2.47e- 4
3 SVY          -0.0109   0.0203     -0.538 5.91e- 1
4 HSVISIT       0.0527   0.0372      1.42  1.57e- 1
5 AGE          -0.0180   0.00638    -2.82  4.96e- 3
6 STRATA1      -0.0252   0.0304     -0.829 4.08e- 1
7 STRATA2      -0.0233   0.0239     -0.974 3.30e- 1
8 STRATA3      -0.0594   0.0315     -1.89  5.93e- 2

[[12]]
# A tibble: 8 x 5
  term         estimate std.error statistic  p.value
  <chr>           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  1.28       0.0706   18.2     6.11e-65
2 VOUCH0       0.0247     0.0119    2.07    3.90e- 2
3 SVY          0.00623    0.0144    0.433   6.65e- 1
4 HSVISIT      0.0221     0.0264    0.837   4.03e- 1
5 AGE         -0.0224     0.00452  -4.95    8.54e- 7
6 STRATA1     -0.0294     0.0216   -1.36    1.74e- 1
7 STRATA2     -0.00267    0.0169   -0.158   8.75e- 1
8 STRATA3     -0.000155   0.0223   -0.00694 9.94e- 1
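
The nested lm_result tibbles printed above can also be flattened into a single table, which makes it easier to compare one coefficient across models. This is a minimal sketch of that step, again on the built-in mtcars data with hypothetical formulas. In this article's setting the term of interest would presumably be VOUCH0 rather than wt.

```r
library(dplyr)
library(purrr)
library(tidyr)
library(tibble)
library(broom)

# Two hypothetical formulas on the built-in mtcars data
formulas <- c(mpg_base = "mpg ~ wt", mpg_covariate = "mpg ~ wt + hp")

coef_table <- formulas %>%
  enframe(name = "model_index", value = "formula") %>%
  mutate(model = map(formula, function(f) lm(as.formula(f), data = mtcars)),
         lm_result = map(model, tidy)) %>%
  # Expand each nested tibble into one row per model and term
  unnest(cols = lm_result) %>%
  # Keep only the coefficient of interest
  filter(term == "wt") %>%
  select(model_index, term, estimate, std.error, statistic, p.value)

print(coef_table)
```

The result has one row per model, so the estimate, t-value (statistic) and p-value of the same term can be read side by side.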