R言語 Advent Calendar 2025

John Fox さんを偲んで

Last updated at 2025-12-09Posted at 2025-12-09

大変残念なことですが、初期のRの普及に大きく貢献されたJohn Fox さんが亡くなられたそうです。

この記事では John Fox さんが作られたR言語のパッケージの紹介を通じて追悼したいと思います。

car

carパッケージは"An R Companion to Applied Regression" という書籍のために作られたパッケージで、回帰分析を補助する様々な機能が実装されています。

特に多重共線性のチェックをするVIF（Variance Inflation Factor、分散拡大係数）を算出するvif()にはお世話になりました。一般にVIFが10以上の説明変数は深刻な多重共線性がある可能性があります。

mtcarsデータセットで試してみましょう。

自動車の燃費（mpg）を排気量（disp）、シリンダー数（cyl）、馬力（hp）、車重（wt）で回帰します。

library(conflicted)
library(tidyverse)
library(car)

data(mtcars)

model <- mtcars |>
  lm(mpg ~ cyl + disp + hp + wt, data = _)

# carパッケージのVIF関数で多重共線性を診断
vif(model)
#       cyl      disp        hp        wt 
#  6.737707 10.373286  3.405983  4.848016

この例では排気量（disp）に多重共線性の問題がありそうです。

Rcmdr

RcmdrパッケージははR言語の基本的な統計解析機能をGUIで操作できるようにするパッケージです。

library(Rcmdr)

このようにメニューから選ぶだけで統計量の算出、可視化、回帰モデルの構築、因子分析、クラスター分析などが実行できます。
特にGUIベースの統計解析ソフトからRに移行するユーザーには大きな助けになりました。
私も元々SPSSユーザーでしたので、Rの基礎はRcmdrで学びました。

sem

semは構造方程式モデリング（Structural Equation Modeling, SEM）のためのパッケージ。
現在ではlavaanパッケージの方が人気ですが、RでSEMが実行できるようになったのには感動しました。

library(sem)
R.DHP <- readMoments(
  diag=FALSE,
  names=c("ROccAsp", "REdAsp", "FOccAsp", "FEdAsp", "RParAsp", "RIQ", "RSES", "FSES", "FIQ", "FParAsp"),
  text="
    .6247     
    .3269  .3669       
    .4216  .3275  .6404
    .2137  .2742  .1124  .0839
    .4105  .4043  .2903  .2598  .1839
    .3240  .4047  .3054  .2786  .0489  .2220
    .2930  .2407  .4105  .3607  .0186  .1861  .2707
    .2995  .2863  .5191  .5007  .0782  .3355  .2302  .2950
    .0760  .0702  .2784  .1988  .1147  .1021  .0931 -.0438  .2087
  ")

model.dhp.1 <- specifyEquations(
  covs="RGenAsp, FGenAsp",
  text="
    RGenAsp = gam11*RParAsp + gam12*RIQ + gam13*RSES + gam14*FSES + beta12*FGenAsp
    FGenAsp = gam23*RSES + gam24*FSES + gam25*FIQ + gam26*FParAsp + beta21*RGenAsp
    ROccAsp = 1*RGenAsp
    REdAsp = lam21(1)*RGenAsp  # to illustrate setting start values
    FOccAsp = 1*FGenAsp
    FEdAsp = lam42(1)*FGenAsp
  ")

sem.dhp.1 <- sem(
  model = model.dhp.1,
  S = R.DHP,
  N = 329,
  fixed.x=c('RParAsp', 'RIQ', 'RSES', 'FSES', 'FIQ', 'FParAsp')
  )

summary(sem.dhp.1)

library(semPlot)

sem.dhp.1 |>
  semPaths("std")

polycor

polycorは順序尺度の相関係数であるポリコリック相関係数やポリシリアル相関係数を算出するためのパッケージです。

特にカテゴリカル因子分析やSEMに順序尺度を含める際に使いました。

library(polycor)
data.frame(
  w = c(6, 7, 2, 5, 6, 3, 3, 4, 1, 2),
  x = c(1, 2, 3, 6, 1, 4, 3, 4, 7, 2),
  y = as.factor(c(3, 3, 1, 1, 2, 2, 2, 1, 2, 2)),
  z = as.factor(c(2, 3, 2, 1, 1, 2, 1, 1, 3, 2))
  ) |>
  hetcor()
# Two-Step Estimates
# 
# Correlations/Type of Correlation:
#         w       x          y          z
# w       1 Pearson Polyserial Polyserial
# x -0.5124       1 Polyserial Polyserial
# y  0.4963 -0.5661          1 Polychoric
# z -0.1834  0.1247     0.6475          1
# 
# Standard Errors:
#        w      x      y
# w                     
# x 0.2198              
# y 0.2672 0.2391       
# z 0.3521 0.3726 0.2527
# 
# n = 10 
# 
# P-values for Tests of Bivariate Normality:
#        w      x      y
# w                     
# x 0.5489              
# y 0.1887 0.9684       
# z 0.445  0.7139 0.8568

ivreg

ivregは統計的因果推論の操作変数法のための2段階最小二乗法(2SLS)を実行するためのパッケージです。
操作変数法はバックドア基準を満たすための交絡変数が未測定である場合に、介入変数だけに影響する操作変数（instrumental variables, IV）を用いて介入変数の効果を推定する手法。

library(ivreg)
data(Kmenta)

# 通常の回帰分析
Kmenta |>
  lm(Q ~ P + F + A, data = _) |>
  summary()
# Call:
# lm(formula = Q ~ P + F + A, data = Kmenta)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -4.4074 -1.2224  0.2078  1.6053  3.1591 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept) 58.27543   11.46291   5.084 0.000111 ***
# P            0.16037    0.09488   1.690 0.110388    
# F            0.24813    0.04619   5.372 6.23e-05 ***
# A            0.24830    0.09752   2.546 0.021567 *  
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 2.405 on 16 degrees of freedom
# Multiple R-squared:  0.6548,	Adjusted R-squared:  0.5901 
# F-statistic: 10.12 on 3 and 16 DF,  p-value: 0.0005602

# 2段階最小二乗法
Kmenta |>
  ivreg(Q ~ P + F + A| D + F + A, data = _) |>
  summary()
# Call:
# ivreg(formula = Q ~ P + F + A | D + F + A, data = Kmenta)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -4.8724 -1.2593  0.6415  1.4745  3.4865 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept) 49.53244   12.01053   4.124 0.000795 ***
# P            0.24008    0.09993   2.402 0.028785 *  
# F            0.25561    0.04725   5.410 5.79e-05 ***
# A            0.25292    0.09966   2.538 0.021929 *  
# 
# Diagnostic tests:
#                  df1 df2 statistic  p-value    
# Weak instruments   1  16    256.34 2.86e-11 ***
# Wu-Hausman         1  15     36.14 2.38e-05 ***
# Sargan             0  NA        NA       NA    
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 2.458 on 16 degrees of freedom
# Multiple R-Squared: 0.6396,	Adjusted R-squared: 0.572 
# Wald test:  10.7 on 3 and 16 DF,  p-value: 0.0004196

R.I.P.

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up