Reactive stat 内部の Friedman 検定 の実装 です
Friedman 検定は、3つ以上の対応のあるグループ間の差異を検定するノンパラメトリック手法です。
この記事では、Friedman 検定 の概要と、JavaScriptでの実装方法を説明します。
はじめに
最近、弊社より ブラウザだけで使える無料統計ソフト Reactive stat をリリースしました。
信頼性の高い R で統計解析し、その結果を AI が解説します!
その背景には、統計に苦労している医療者の助けになりたい、という気持ちがあります。
最終的な統計解析は R で行うのですが、レスポンスとサーバー負荷の改善のため、一部の統計計算はブラウザ内で行っています。
そのうち、汎用性の高い部分を共有させていただきたいと思います。
できるだけ R の出力と整合性を持つように javascript をインプリメントしてあるので、参考になれば幸いです。
使用したライブラリ
このコードでは、jStatライブラリを使用しています。jStatは、JavaScriptで統計計算を行うための便利なライブラリです。
jStatの詳細については、公式ドキュメント を参照してください。
全関数の一覧は jStat v1.9.3 Documentation を参照してください。
Friedman検定 の概要
フリードマン検定は、3つ以上の対応のある群間で、群の効果が存在するかどうかを検定する非パラメトリックな手法です。
この検定は中央値の比較を行います。
正規分布を仮定しないので、データの分布が歪んでいる場合や等分散性が仮定できない場合でも利用できます。
同じ被験者に対して複数回、異なる条件や時間点での測定結果を比較する際に使用されます。
例えば、ある治療法の効果を3つの異なる時間点で評価するような場面です。
注意事項
フリードマン検定は対応のあるデータ用ですので、独立した群間のデータには使用できません。
また、検定の力が低い場合があるので、効果サイズやサンプルサイズの計算も検討する必要があります。
計算式
-
各ペアの観測値に順位を付ける:
$$r_{ij} = \text{rank} (X_{ij})$$
ここで、$r_{ij}$ は $j$ 群の $i$ 番目の観測値の順位を表します。 -
各群の順位の和を計算する:
$$R_j = \sum_{i=1}^{n} r_{ij}$$
$R_j$ は $j$ 群の順位の和を表します。 -
同順位の補正を行う:
同順位が存在する場合、フリードマン統計量の計算に影響を与えます。同順位の補正は以下の式で行われます。
$$\sum_{i=1}^{n} \sum_{j=1}^{k} (r_{ij}^3 - r_{ij})$$
ここで、$r_{ij}$ は $j$ 群の $i$ 番目の観測値の順位を表します。 -
フリードマン統計量を計算する:
$$\chi^2 = \frac{12}{n \times k \times (k + 1)} \times \left(\sum_{j=1}^{k} R_j^2\right) - 3n(k+1) - \frac{1}{n \times k \times (k-1)} \times \left(\sum_{i=1}^{n} \sum_{j=1}^{k} (r_{ij}^3 - r_{ij})\right)$$
ここで、$n$ は各群の観測数、$k$ は群の数です。 -
自由度 $k - 1$ の $\chi^2$ 分布を用いてp値を計算します。
コード
ライブラリ込みの html にしてありますので、
jsfiddle などにコピペして簡単に試せます。
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Friedman Test</title>
<script src="https://cdn.jsdelivr.net/npm/jstat@latest/dist/jstat.min.js"></script>
</head>
<body>
<h1>Friedman Test Example</h1>
<p>Check the console for the result.</p>
<script>
function friedmanTest(...groups) {
// データを転置して、各行がブロックに対応するようにする
var y = jStat.transpose(groups);
// ブロックの数を0からy.length - 1までの連続した整数で表す
var blocks = jStat.seq(0, y.length - 1);
var k = groups.length;
var n = y.length;
// 各ブロック内でのランク付け
var r = y.map(function(row) {
return row.map(function(x, i) {
return jStat.rank(row)[i];
});
});
// 同順位の数を数える
var ties = r.map(function(row) {
var counts = {};
row.forEach(function(x) {
counts[x] = (counts[x] || 0) + 1;
});
return counts;
});
// 各グループのランクの合計を計算
var colSums = jStat.transpose(r).map(function(col) {
return col.reduce(function(a, b) {
return a + b;
}, 0);
});
// Friedman検定の統計量を計算
// k-1: 自由度
var statistic = 12 * jStat.sum(colSums.map(function(x) {
return Math.pow(x - n * (k + 1) / 2, 2);
})) / (n * k * (k + 1) - jStat.sum(ties.map(function(tie) {
return jStat.sum(Object.values(tie).map(function(u) {
return Math.pow(u, 3) - u;
}));
})) / (k - 1));
var pVal = 1 - jStat.chisquare.cdf(statistic, k - 1);
return pVal;
}
// サンプルデータ
const groups = [["254","269","292","248","258","232","332","215","285","202","168","341","329","269","232","284","213","237","231","221","286","233","177","270","286","183","341","308","270","270","176","283","276","176","231","301","173","169","344","192","188","223","287","218","280","152","334","179","197","179","200","219","249","246","256","214","295","253","317","364","258","326","201","297","227","353","302","290","309","263","261","168","328","250","254","375","258","249","204","218","92","267","277","245","174","246","279","247","147","320","255","315","171","177","150","292","187","243","217","175"],["252","255","263","272","262","237","353","223","263","200","170","331","348","275","233","276","256","248","232","230","285","236","164","313","314","174","319","294","281","266","158","278","268","180","229","270","182","153","359","192","170","224","274","248","260","168","318","195","263","213","196","224","290","287","281","227","291","230","343","359","246","288","209","288","205","345","305","291","315","247","262","179","338","223","242","409","251","230","233","210","78","312","263","259","180","231","294","260","157","328","275","325","209","184","165","309","181","203","208","154"],["266","297","314","225","282","223","337","220","304","224","157","339","349","269","255","316","198","294","224","206","266","215","233","285","310","163","334","282","279","277","174","289","269","212","225","324","159","160","393","195","187","185","295","213","266","169","351","205","209","219","246","225","258","263","271","203","289","278","339","344","236","346","212","293","203","363","275","293","311","272","272","200","340","212","241","391","244","239","197","237","64","285","321","257","190","261","264","256","180","320","241","299","142","206","185","325","207","219","222","167"],["259","297","296","263","318","245","326","224","279","205","171","365","334","304","259","276","207","228","201","206","255","233","151","285","326","177","384","316","303","261","155","298","295","199","267","303","157","167","339","172","198","226","301","213","301","182","345","166","244","203","188","226","258","303","267","264","309","263","323","359","293","346","232","313","274","365","351","269","322","257","286","165","350","243","263","380","271","288","209","184","111","305","277","229","170","253","291","296","141","328","291","326","194","188","132","306","195","262","212","176"]];
const pValue = friedmanTest(...groups);
console.log("Friedman Test p-value: ", pValue);
</script>
</body>
</html>
検証用 R コード
rdrr にコピペして簡単に試せます。
# Friedman Test
# サンプルデータ
data <- data.frame(
Subject = factor(rep(1:100, times=4)),
Time = factor(rep(1:4, each=100), , labels=c("A","B","C","D")),
Value = c(254,269,292,248,258,232,332,215,285,202,168,341,329,269,232,284,213,237,231,221,286,233,177,270,286,183,341,308,270,270,176,283,276,176,231,301,173,169,344,192,188,223,287,218,280,152,334,179,197,179,200,219,249,246,256,214,295,253,317,364,258,326,201,297,227,353,302,290,309,263,261,168,328,250,254,375,258,249,204,218,92,267,277,245,174,246,279,247,147,320,255,315,171,177,150,292,187,243,217,175,252,255,263,272,262,237,353,223,263,200,170,331,348,275,233,276,256,248,232,230,285,236,164,313,314,174,319,294,281,266,158,278,268,180,229,270,182,153,359,192,170,224,274,248,260,168,318,195,263,213,196,224,290,287,281,227,291,230,343,359,246,288,209,288,205,345,305,291,315,247,262,179,338,223,242,409,251,230,233,210,78,312,263,259,180,231,294,260,157,328,275,325,209,184,165,309,181,203,208,154,266,297,314,225,282,223,337,220,304,224,157,339,349,269,255,316,198,294,224,206,266,215,233,285,310,163,334,282,279,277,174,289,269,212,225,324,159,160,393,195,187,185,295,213,266,169,351,205,209,219,246,225,258,263,271,203,289,278,339,344,236,346,212,293,203,363,275,293,311,272,272,200,340,212,241,391,244,239,197,237,64,285,321,257,190,261,264,256,180,320,241,299,142,206,185,325,207,219,222,167,259,297,296,263,318,245,326,224,279,205,171,365,334,304,259,276,207,228,201,206,255,233,151,285,326,177,384,316,303,261,155,298,295,199,267,303,157,167,339,172,198,226,301,213,301,182,345,166,244,203,188,226,258,303,267,264,309,263,323,359,293,346,232,313,274,365,351,269,322,257,286,165,350,243,263,380,271,288,209,184,111,305,277,229,170,253,291,296,141,328,291,326,194,188,132,306,195,262,212,176)
)
# Friedman 検定を実行
result <- with(data, friedman.test(Value ~ Time | Subject))
# 結果の表示
print(result)
注意点
- このコードは、jStatライブラリに依存しています。jStatを読み込まないと動作しません
- 浮動小数点数の計算では、わずかな誤差が生じる可能性があります
- Friedman 検定は、対応のあるデータに対してのみ適用できます。独立したグループには適用できません
- Friedman 検定は、各群の中央値に差があるかどうかを検定します。平均値の差を検定するものではありません
最後に
コードの内容はよく吟味し、R との食い違いが極力ないように気を付けていますが、もし間違いに気づかれた場合には指摘していただけますとありがたいです。
ただ、Chat GPT に読ませてその出力を検証もせず「問題があるので修正します」として貼り付けられると、他の方を混乱させてしまうのと、弊社の製品に対する信頼を損ねることにもつながりかねませんので、どうかそのようなことのないようにお願いいたします。
検証のための R のコードもつけてありますので、ご活用いただけますと幸いです。