
t-test in F#: Comparing the Exam Scores of Two Classes

Posted at 2019-07-19

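Overview

For data analysis, Python or R is the obvious choice, right?

...So, naturally, I ran an unpaired t-test in F#. This is a port of "Pythonでt検定　2クラスの試験成績の比較" ("t-test in Python: Comparing the Exam Scores of Two Classes"). For an explanation of the statistics involved, please refer to the Python version of the entry.

To do the analysis in F#, I used the Accord.NET package. Install Accord.Statistics from NuGet beforehand. Version 3.8.0 was used here.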
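As a sketch of an alternative to adding the package to a compiled project, the same dependency can be pulled into an F# script. This assumes a .NET 5 SDK or later with `dotnet fsi`, and that the 3.8.0 package resolves on your runtime; neither is something the original setup relied on.

```fsharp
// Script-style setup (assumption: .NET 5+ SDK, run with `dotnet fsi script.fsx`).
// The #r "nuget: ..." directive downloads the package the first time it is used.
#r "nuget: Accord.Statistics, 3.8.0"

open Accord.Statistics

// Smoke test that the package loaded: print the mean of a tiny array.
printfn "%.2f" (Measures.Mean([| 1.0; 2.0; 3.0 |]))
```

In a regular console project, the NuGet package manager does the same job as the `#r` line.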

Code

Unpaired t-test

open System
open Accord.Statistics.Testing
open Accord.Statistics.Distributions.Univariate

// Short alias for the static class that provides Mean, StandardDeviation, etc.
type m = Accord.Statistics.Measures

[<EntryPoint>]
let main argv =

  // Exam scores of class A and class B (40 students each), converted to float
  let xa = [|75; 87; 89; 80; 84; 81; 88; 83; 88; 88; 82; 72; 74; 93; 77; 67; 88; 84; 68; 84; 80; 78; 75; 71; 82; 74; 84; 77; 79; 76; 83; 75; 86; 76; 80; 76; 68; 72; 75; 85|]
           |> Array.map float
  let xb = [|64; 77; 79; 73; 89; 82; 59; 85; 80; 75; 65; 79; 65; 74; 73; 72; 69; 83; 90; 73; 88; 59; 62; 80; 64; 74; 81; 70; 69; 67; 81; 67; 72; 71; 72; 78; 78; 82; 72; 71|]
           |> Array.map float

  printfn "Class A mean score = %.2f" <| m.Mean(xa)
  printfn "Class B mean score = %.2f" <| m.Mean(xb)
  printfn "Difference between the two group means = %.2f" <| Math.Abs(m.Mean(xa) - m.Mean(xb))

  printfn ""
  printfn "Estimated population mean for sample A = %.2f" <| m.Mean(xa)
  printfn "Estimated population mean for sample B = %.2f" <| m.Mean(xb)
  // The second argument (true) requests the unbiased estimator
  printfn "Estimated population standard deviation for sample A (unbiased SD) = %.2f" <| m.StandardDeviation(xa, true)
  printfn "Estimated population standard deviation for sample B (unbiased SD) = %.2f" <| m.StandardDeviation(xb, true)

  // Normality check for each sample
  printfn ""
  printfn "Shapiro-Wilk test"
  let swa = new ShapiroWilkTest(xa)
  let swb = new ShapiroWilkTest(xb)
  printfn "Sample A p-value = %.3f" <| swa.PValue
  printfn "Sample B p-value = %.3f" <| swb.PValue

  // 95% CI for the mean of A, using a t distribution with n - 1 degrees of freedom
  printfn ""
  let t = new TDistribution(float (xa.Length - 1))
  let ciLowA = m.Mean(xa) + m.StandardError(xa) * t.InverseDistributionFunction(0.025)
  let ciHighA = m.Mean(xa) + m.StandardError(xa) * t.InverseDistributionFunction(0.975)
  printfn "95%% confidence interval (CI) for the population mean of sample A = [%.2f , %.2f]" ciLowA ciHighA

  // Equality-of-variances check (second argument false: use the mean, not the median)
  printfn ""
  printfn "Levene's test"
  let lvt = new LeveneTest([|xa; xb|], false)
  printfn "p-value = %.3f" <| lvt.PValue

  // Unpaired t-test; the third argument (true) assumes equal variances
  printfn ""
  printfn "Unpaired t-test"
  let tt = new TwoSampleTTest(xa, xb, true)
  printfn "p-value = %.3f" <| tt.PValue
  printfn "t-value = %.2f" <| tt.Statistic
  printfn "Difference of means = %.2f" <| tt.ObservedDifference
  printfn "Standard error of the difference = %.2f" <| tt.StandardError

  // 95% CI for the difference, from the test's own statistic distribution
  let ciLowDiff = tt.ObservedDifference + tt.StandardError * tt.StatisticDistribution.InverseDistributionFunction(0.025)
  let ciHighDiff = tt.ObservedDifference + tt.StandardError * tt.StatisticDistribution.InverseDistributionFunction(0.975)

  printfn "95%% CI for the difference of means = [%.2f , %.2f]" ciLowDiff ciHighDiff

  Console.ReadKey() |> ignore
  0
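To make the last step less of a black box, here is a minimal hand-rolled sketch of the textbook pooled-variance (Student) t statistic, without Accord.NET. The helper names `mean`, `variance` and `pooledT` and the toy data are hypothetical, and the sketch assumes that the equal-variance option of the test corresponds to this standard formula.

```fsharp
// Hand-rolled pooled-variance t statistic (illustrative helpers, not Accord.NET API).
let mean (xs: float[]) = Array.average xs

// Unbiased sample variance: squared deviations divided by n - 1.
let variance (xs: float[]) =
    let m = mean xs
    (xs |> Array.sumBy (fun x -> (x - m) ** 2.0)) / float (xs.Length - 1)

// Returns (t statistic, degrees of freedom, standard error of the difference).
let pooledT (a: float[]) (b: float[]) =
    let na, nb = float a.Length, float b.Length
    let df = na + nb - 2.0
    // Pooled variance: the two sample variances weighted by their degrees of freedom.
    let sp2 = ((na - 1.0) * variance a + (nb - 1.0) * variance b) / df
    let se = sqrt (sp2 * (1.0 / na + 1.0 / nb))
    (mean a - mean b) / se, df, se

// Toy data (hypothetical), just to exercise the function.
let t, df, se = pooledT [| 1.0; 2.0; 3.0 |] [| 2.0; 4.0; 6.0 |]
printfn "t = %.4f, df = %.0f, SE = %.4f" t df se
// prints: t = -1.5492, df = 4, SE = 1.2910
```

Passing the two score arrays from the listing above in place of the toy data gives values that can be compared against what `TwoSampleTTest` reports.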

Results

The results match those obtained in Python with scipy.stats.

Output

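Class A mean score = 79.60
Class B mean score = 74.10
Difference between the two group means = 5.50

Estimated population mean for sample A = 79.60
Estimated population mean for sample B = 74.10
Estimated population standard deviation for sample A (unbiased SD) = 6.42
Estimated population standard deviation for sample B (unbiased SD) = 7.89

Shapiro-Wilk test
Sample A p-value = 0.558
Sample B p-value = 0.747

95% confidence interval (CI) for the population mean of sample A = [77.55 , 81.65]

Levene's test
p-value = 0.272

Unpaired t-test
p-value = 0.001
t-value = 3.42
Difference of means = 5.50
Standard error of the difference = 1.61
95% CI for the difference of means = [2.30 , 8.70]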
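As a rough sanity check, the t-test figures can be reproduced by hand from the rounded values printed above. This assumes the standard pooled-variance formula with $n_A = n_B = 40$, and takes the critical value $t_{0.975,\,78} \approx 1.99$ from a t table rather than from the program.

```latex
s_p^2 = \frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}
      \approx \frac{39 \cdot 6.42^2 + 39 \cdot 7.89^2}{78} \approx 51.7

\mathrm{SE} = \sqrt{s_p^2 \left( \frac{1}{n_A} + \frac{1}{n_B} \right)}
            \approx \sqrt{51.7 \times 0.05} \approx 1.61

t = \frac{\bar{x}_A - \bar{x}_B}{\mathrm{SE}} \approx \frac{5.50}{1.61} \approx 3.42

\mathrm{CI}_{95\%} = (\bar{x}_A - \bar{x}_B) \pm t_{0.975,\,78} \cdot \mathrm{SE}
                   \approx 5.50 \pm 1.99 \times 1.61 \approx [2.30,\ 8.70]
```

Within rounding, these agree with the reported t-value, standard error, and confidence interval.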
