Python - Reading data from a numeric data file and computing the covariance

Posted at 2017-02-18

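A program that reads data from a tab-delimited file of numeric data series, like the one below, and computes the covariance of every pair of series. Each line of the file is one series.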

MultipleRegressionAnalysis_Data1.txt
65.7    67.8    70.3    72.0    74.3    76.2
3.27    3.06    4.22    4.10    5.26    6.18
69.7    69.7    71.3    77.6    81.0    78.7
correlation.py
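import numpy as np


def load_data(filename):
    """Read a tab-delimited file; each line becomes one list of floats."""
    x = []
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines such as a trailing newline
            x.append([float(value) for value in line.split('\t')])
    return x


def calc_average(data):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(data) / len(data)


def calc_correlation(data1, data2):
    """Return the population covariance E[XY] - E[X]E[Y] of two series.

    Despite the name, the value is a covariance (divided by N),
    not a correlation coefficient.
    """
    ave1 = calc_average(data1)
    ave2 = calc_average(data2)
    total = 0.0  # avoid shadowing the built-in sum()
    for a, b in zip(data1, data2):
        total += a * b
    return (total / len(data1)) - (ave1 * ave2)


if __name__ == "__main__":
    data = load_data('MultipleRegressionAnalysis_Data1.txt')
    # Covariance matrix: entry (i, j) is the covariance of series i and j.
    corr = [[calc_correlation(row_i, row_j) for row_j in data]
            for row_i in data]

    print(np.array(corr))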
Result
>python correlation.py
[[ 12.95583333   3.70208333  14.89666667]
 [  3.70208333   1.18114722   4.10327778]
 [ 14.89666667   4.10327778  20.94222222]]
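As a cross-check, the same matrix can probably be obtained directly from NumPy. The following is a minimal sketch, not part of the original program: it assumes the file contents shown above, inlined as a string (the name `RAW` is hypothetical) so that it runs without the data file. It uses `np.loadtxt` to parse the tab-delimited text and `np.cov` with `bias=True`, which divides by N as the hand-written calculation does.

```python
import io

import numpy as np

# Same values as MultipleRegressionAnalysis_Data1.txt, inlined (assumption)
# so that this sketch is self-contained.
RAW = (
    "65.7\t67.8\t70.3\t72.0\t74.3\t76.2\n"
    "3.27\t3.06\t4.22\t4.10\t5.26\t6.18\n"
    "69.7\t69.7\t71.3\t77.6\t81.0\t78.7\n"
)

# Each row is one data series (3 series, 6 observations each).
data = np.loadtxt(io.StringIO(RAW), delimiter="\t")

# np.cov treats rows as variables by default; bias=True normalizes by N
# (population covariance) instead of the default N - 1.
cov = np.cov(data, bias=True)
print(cov)
```

Without `bias=True`, `np.cov` divides by N - 1 (sample covariance), so the values would come out larger by a factor of 6/5 for this data.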