

I really wanted to solve multiple linear regression in Python, but I was told to do it in a spreadsheet

Posted at 2018-10-30

This is a sequel to "I really wanted to solve linear regression in Python, but I was told to do it in a spreadsheet".

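Data used

Once again we use the data on molecular substances (melting and boiling points). Last time was an impossible game: predicting the boiling point from the molecular weight alone. This time, let's predict the boiling point using the dipole moment as well as the molecular weight.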

import pandas as pd
data = [['HF', 19.5, 20.0, 1.826567],
        ['HCl', -84.9, 36.5, 1.1086],
        ['HBr', -67.0, 80.9, 0.8271],
        ['HI', -35.1, 127.9, 0.4477],
        ['H2O', 100.0, 18.0, 1.8546],
        ['H2S', -60.7, 34.1, 0.978325],
        ['H2Se', -42, 81.0, 0.627],
        ['NH3', -33.4, 17.0, 1.471772],
        ['PH3', -87, 34.0, 0.57397],
        ['AsH3', -55, 77.9, 0.217],
        ['SbH3', -17.1, 124.8, 0.116],
        ['CH4', -161.49, 16.0, 0],
        ['SiH4', -111.8, 32.1, 0],
        ['GeH4', -90, 76.6, 0],
        ['SnH4', -52, 122.7, 0],
        ['He', -268.934, 4.0, 0],
        ['Ne', -246.048, 20.2, 0],
        ['Ar', -185.7, 39.9, 0],
        ['Kr', -152.3, 83.8, 0],
        ['Xe', -108.1, 131.3, 0],
       ]
df = pd.DataFrame(data, columns = ['molecule', 'boiling point', 'molecular weight', 'dipole monent'])
df
molecule boiling point molecular weight dipole monent
0 HF 19.500 20.0 1.826567
1 HCl -84.900 36.5 1.108600
2 HBr -67.000 80.9 0.827100
3 HI -35.100 127.9 0.447700
4 H2O 100.000 18.0 1.854600
5 H2S -60.700 34.1 0.978325
6 H2Se -42.000 81.0 0.627000
7 NH3 -33.400 17.0 1.471772
8 PH3 -87.000 34.0 0.573970
9 AsH3 -55.000 77.9 0.217000
10 SbH3 -17.100 124.8 0.116000
11 CH4 -161.490 16.0 0.000000
12 SiH4 -111.800 32.1 0.000000
13 GeH4 -90.000 76.6 0.000000
14 SnH4 -52.000 122.7 0.000000
15 He -268.934 4.0 0.000000
16 Ne -246.048 20.2 0.000000
17 Ar -185.700 39.9 0.000000
18 Kr -152.300 83.8 0.000000
19 Xe -108.100 131.3 0.000000

The explanatory variables X look like this.

X = df.loc[:, ['molecular weight', 'dipole monent']].to_numpy()  # as_matrix() was removed in pandas 1.0
X
array([[  2.00000000e+01,   1.82656700e+00],
       [  3.65000000e+01,   1.10860000e+00],
       [  8.09000000e+01,   8.27100000e-01],
       [  1.27900000e+02,   4.47700000e-01],
       [  1.80000000e+01,   1.85460000e+00],
       [  3.41000000e+01,   9.78325000e-01],
       [  8.10000000e+01,   6.27000000e-01],
       [  1.70000000e+01,   1.47177200e+00],
       [  3.40000000e+01,   5.73970000e-01],
       [  7.79000000e+01,   2.17000000e-01],
       [  1.24800000e+02,   1.16000000e-01],
       [  1.60000000e+01,   0.00000000e+00],
       [  3.21000000e+01,   0.00000000e+00],
       [  7.66000000e+01,   0.00000000e+00],
       [  1.22700000e+02,   0.00000000e+00],
       [  4.00000000e+00,   0.00000000e+00],
       [  2.02000000e+01,   0.00000000e+00],
       [  3.99000000e+01,   0.00000000e+00],
       [  8.38000000e+01,   0.00000000e+00],
       [  1.31300000e+02,   0.00000000e+00]])

The target variable y is the boiling point, the same as last time.

Y = df['boiling point'].to_numpy()  # as_matrix() was removed in pandas 1.0
Y
array([  19.5  ,  -84.9  ,  -67.   ,  -35.1  ,  100.   ,  -60.7  ,
        -42.   ,  -33.4  ,  -87.   ,  -55.   ,  -17.1  , -161.49 ,
       -111.8  ,  -90.   ,  -52.   , -268.934, -246.048, -185.7  ,
       -152.3  , -108.1  ])

Let's use a scatter matrix to get an overview of how the boiling point, molecular weight, and dipole moment relate to each other.

%matplotlib inline
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix  # pandas.tools.plotting no longer exists in current pandas
scatter_matrix(df)
plt.show()

[Figure: scatter matrix of boiling point, molecular weight, and dipole moment]
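
The scatter matrix gives a qualitative picture; the pairwise Pearson correlations put numbers on it. The following is a minimal sketch that rebuilds the numeric columns of the table above. The short column names `bp`, `mw`, and `dipole` are my own shorthand, not names used in the article.

```python
import pandas as pd

# Same 20 molecules as in the table above:
# (boiling point, molecular weight, dipole moment)
rows = [
    (19.5, 20.0, 1.826567), (-84.9, 36.5, 1.1086), (-67.0, 80.9, 0.8271),
    (-35.1, 127.9, 0.4477), (100.0, 18.0, 1.8546), (-60.7, 34.1, 0.978325),
    (-42.0, 81.0, 0.627), (-33.4, 17.0, 1.471772), (-87.0, 34.0, 0.57397),
    (-55.0, 77.9, 0.217), (-17.1, 124.8, 0.116), (-161.49, 16.0, 0.0),
    (-111.8, 32.1, 0.0), (-90.0, 76.6, 0.0), (-52.0, 122.7, 0.0),
    (-268.934, 4.0, 0.0), (-246.048, 20.2, 0.0), (-185.7, 39.9, 0.0),
    (-152.3, 83.8, 0.0), (-108.1, 131.3, 0.0),
]
# Hypothetical short column names, chosen only for this sketch
num = pd.DataFrame(rows, columns=["bp", "mw", "dipole"])

# DataFrame.corr() computes pairwise Pearson correlations by default
corr = num.corr()
print(corr.round(3))
```

On this data the boiling point correlates more strongly with the dipole moment (about 0.70) than with the molecular weight (about 0.23), which is one reason to expect the second variable to help.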

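First, the most convenient option: scikit-learn

Last time I showed that linear regression is easy with scikit-learn. Even when the number of variables grows and it becomes multiple linear regression, the effort barely increases. Yes, as long as you have scikit-learn.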

from sklearn import linear_model
lr = linear_model.LinearRegression()
lr.fit(X, Y)
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
# Regression coefficients
lr.coef_
array([   1.17775083,  124.38195483])
# Intercept
lr.intercept_
-218.85778226168119
print("y = f(x) = w1x1 + w2x2 + t; (w1, w2, t) = ({0}, {1}, {2})".format(lr.coef_[0], lr.coef_[1], lr.intercept_))
y = f(x) = w1x1 + w2x2 + t; (w1, w2, t) = (1.1777508314139944, 124.38195482549655, -218.8577822616812)
# Coefficient of determination (R^2)
lr.score(X, Y)
0.78084985722905187
lr.predict([[200, 0.5]])  # predict expects a 2D array: one row per sample
array([ 78.88336143])
lr.predict([[150, 0], [100, 1.0]])
array([-42.19515755,  23.29925571])
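
As a cross-check on the scikit-learn result, the same fit can be obtained as an ordinary least-squares problem with NumPy. This is only a sketch, not how `LinearRegression` is necessarily implemented internally; the names `features`, `target`, and `design` are my own.

```python
import numpy as np

# (molecular weight, dipole moment) for the same 20 molecules
features = np.array([
    [20.0, 1.826567], [36.5, 1.1086], [80.9, 0.8271], [127.9, 0.4477],
    [18.0, 1.8546], [34.1, 0.978325], [81.0, 0.627], [17.0, 1.471772],
    [34.0, 0.57397], [77.9, 0.217], [124.8, 0.116], [16.0, 0.0],
    [32.1, 0.0], [76.6, 0.0], [122.7, 0.0], [4.0, 0.0],
    [20.2, 0.0], [39.9, 0.0], [83.8, 0.0], [131.3, 0.0],
])
# Boiling points in the same order
target = np.array([
    19.5, -84.9, -67.0, -35.1, 100.0, -60.7, -42.0, -33.4, -87.0, -55.0,
    -17.1, -161.49, -111.8, -90.0, -52.0, -268.934, -246.048, -185.7,
    -152.3, -108.1,
])

# Append a column of ones so the intercept is estimated as a third weight
design = np.column_stack([features, np.ones(len(features))])
(c1, c2, c0), *_ = np.linalg.lstsq(design, target, rcond=None)
print(c1, c2, c0)

# Prediction for molecular weight 200 and dipole moment 0.5
pred = c1 * 200 + c2 * 0.5 + c0
print(pred)
```

The three values should agree with `lr.coef_` and `lr.intercept_` above up to floating-point error.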

Next, in hardcore pure Python.

Solving this in pure Python without scikit-learn takes a bit more work. Let's give it a try.

# Function that computes the mean
def mean(values):
    total = 0
    for x in values:
        total += x
    return total / len(values)
# Function that computes the (population) variance
def variance(values):
    ave = mean(values)
    total = 0
    for x in values:
        total += (x - ave) ** 2
    return total / len(values)
# Function that computes the standard deviation
import math
def standard_deviation(values):
    return math.sqrt(variance(values))
# Covariance = mean of the products of deviations
def covariance(list1, list2):
    mean1 = mean(list1)
    mean2 = mean(list2)
    total = 0
    for d1, d2 in zip(list1, list2):
        total += (d1 - mean1) * (d2 - mean2)
    return total / len(list1)
# Correlation coefficient = covariance divided by the standard deviations of list1 and list2
def correlation(list1, list2):
    return covariance(list1, list2) / (standard_deviation(list1) * standard_deviation(list2))
# Partial regression coefficient of y on b, with the influence of a removed
def partial_regression(a, b, y):
    rby = correlation(b, y)
    ray = correlation(a, y)
    rab = correlation(a, b)
    return (rby - ray * rab) * standard_deviation(y) / ((1 - rab ** 2) * standard_deviation(b))
# Constant w1 = partial regression coefficient of y on x1, with the influence of x2 removed
w1 = partial_regression(X[:, 1], X[:, 0], Y)
# Constant w2 = partial regression coefficient of y on x2, with the influence of x1 removed
w2 = partial_regression(X[:, 0], X[:, 1], Y)
# Constant t = mean of y - w1 * mean of x1 - w2 * mean of x2
t = mean(Y) - w1 * mean(X[:, 0]) - w2 * mean(X[:, 1])
# Print the regression equation
print("y = f(x) = w1x1 + w2x2 + t; (w1, w2, t) = ({0}, {1}, {2})".format(w1, w2, t))
y = f(x) = w1x1 + w2x2 + t; (w1, w2, t) = (1.1777508314139928, 124.38195482549644, -218.85778226168105)
def f(X):
    return w1 * X[:, 0] + w2 * X[:, 1] + t
r2 = 1. - sum((Y - f(X))**2) / sum((Y - mean(Y))**2)
r2
0.78084985722905187
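
The partial regression formula used above can look like magic. One way to see where it comes from, sketched here under my own naming (`avg`, `cov`, `b1`, `b2`, `b0` are not from the article), is to write the least-squares normal equations in terms of variances and covariances and solve the resulting 2x2 system with Cramer's rule. Dividing numerator and denominator by the product of the two variances turns this solution into the correlation-based formula.

```python
def avg(values):
    return sum(values) / len(values)

def cov(a, b):
    """Population covariance (divide by n), matching the article's convention."""
    ma, mb = avg(a), avg(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

# Boiling point, molecular weight, dipole moment (same 20 molecules)
y = [19.5, -84.9, -67.0, -35.1, 100.0, -60.7, -42.0, -33.4, -87.0, -55.0,
     -17.1, -161.49, -111.8, -90.0, -52.0, -268.934, -246.048, -185.7,
     -152.3, -108.1]
x1 = [20.0, 36.5, 80.9, 127.9, 18.0, 34.1, 81.0, 17.0, 34.0, 77.9,
      124.8, 16.0, 32.1, 76.6, 122.7, 4.0, 20.2, 39.9, 83.8, 131.3]
x2 = [1.826567, 1.1086, 0.8271, 0.4477, 1.8546, 0.978325, 0.627, 1.471772,
      0.57397, 0.217, 0.116, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)

# Normal equations in covariance form:
#   b1 * s11 + b2 * s12 = s1y
#   b1 * s12 + b2 * s22 = s2y
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det   # Cramer's rule, first unknown
b2 = (s2y * s11 - s1y * s12) / det   # Cramer's rule, second unknown
b0 = avg(y) - b1 * avg(x1) - b2 * avg(x2)
print(b1, b2, b0)
```

The printed values should match `w1`, `w2`, and `t` from the pure-Python code above up to floating-point error.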

Now, since I was told to solve it in a spreadsheet, let's write it in pandas.

It is tedious enough that I never want to do it again. Here you go.

import copy
from IPython.display import display
excel = copy.deepcopy(df)
excel
molecule boiling point molecular weight dipole monent
0 HF 19.500 20.0 1.826567
1 HCl -84.900 36.5 1.108600
2 HBr -67.000 80.9 0.827100
3 HI -35.100 127.9 0.447700
4 H2O 100.000 18.0 1.854600
5 H2S -60.700 34.1 0.978325
6 H2Se -42.000 81.0 0.627000
7 NH3 -33.400 17.0 1.471772
8 PH3 -87.000 34.0 0.573970
9 AsH3 -55.000 77.9 0.217000
10 SbH3 -17.100 124.8 0.116000
11 CH4 -161.490 16.0 0.000000
12 SiH4 -111.800 32.1 0.000000
13 GeH4 -90.000 76.6 0.000000
14 SnH4 -52.000 122.7 0.000000
15 He -268.934 4.0 0.000000
16 Ne -246.048 20.2 0.000000
17 Ar -185.700 39.9 0.000000
18 Kr -152.300 83.8 0.000000
19 Xe -108.100 131.3 0.000000
excel['y'] = excel['boiling point']
excel['x1'] = excel['molecular weight']
excel['x2'] = excel['dipole monent']
mean_y = mean(excel['y'])
mean_x1 = mean(excel['x1'])
mean_x2 = mean(excel['x2'])
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2]], columns=['y','x1', 'x2'], index=['mean']))
molecule boiling point molecular weight dipole monent y x1 x2
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000
y x1 x2
mean -86.9536 58.935 0.502432
excel['y-mean(y)'] = [y - mean_y for y in excel['y']]
excel['x1-mean(x1)'] = [x1 - mean_x1 for x1 in excel['x1']]
excel['x2-mean(x2)'] = [x2 - mean_x2 for x2 in excel['x2']]
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2]], columns=['y','x1', 'x2'], index=['mean']))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2)
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432
y x1 x2
mean -86.9536 58.935 0.502432
excel['(y-mean(y))**2'] = [sa ** 2 for sa in excel['y-mean(y)']]
excel['(x1-mean(x1))**2'] = [sa ** 2 for sa in excel['x1-mean(x1)']]
excel['(x2-mean(x2))**2'] = [sa ** 2 for sa in excel['x2-mean(x2)']]
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2]], columns=['y','x1', 'x2'], index=['mean']))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438
y x1 x2
mean -86.9536 58.935 0.502432
variance_y = mean(excel['(y-mean(y))**2'])
variance_x1 = mean(excel['(x1-mean(x1))**2'])
variance_x2 = mean(excel['(x2-mean(x2))**2'])
sd_y = math.sqrt(variance_y)
sd_x1 = math.sqrt(variance_x1)
sd_x2 = math.sqrt(variance_x2)
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                             [variance_y, variance_x1, variance_x2], 
                             [sd_y, sd_x1, sd_x2]], 
                            columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438
y x1 x2
mean -86.953600 58.935000 0.502432
variance 7057.696185 1739.624275 0.387350
sd 84.010096 41.708803 0.622375
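
A note on the variance and sd rows above: they divide by n (population statistics), whereas pandas' built-in `var()` and `std()` divide by n - 1 unless told otherwise. The sketch below, using a hypothetical Series name `bp` for the boiling points, shows how to reproduce the table values with `ddof=0`.

```python
import pandas as pd

# Boiling points of the 20 molecules (the y column)
bp = pd.Series([
    19.5, -84.9, -67.0, -35.1, 100.0, -60.7, -42.0, -33.4, -87.0, -55.0,
    -17.1, -161.49, -111.8, -90.0, -52.0, -268.934, -246.048, -185.7,
    -152.3, -108.1,
])

pop_var = bp.var(ddof=0)   # divide by n, as in the table above
pop_sd = bp.std(ddof=0)
sample_var = bp.var()      # pandas default is ddof=1, i.e. divide by n - 1
print(pop_var, pop_sd)
print(sample_var)
```

The regression coefficients themselves do not depend on this choice, because the factor cancels in the ratios, but the intermediate variance and covariance cells do.
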
excel['(x1-mean(x1)) * (x2-mean(x2))'] = excel['x1-mean(x1)'] * excel['x2-mean(x2)']
excel['(y-mean(y)) * (x1-mean(x1))'] = excel['y-mean(y)'] * excel['x1-mean(x1)']
excel['(y-mean(y)) * (x2-mean(x2))'] = excel['y-mean(y)'] * excel['x2-mean(x2)']
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                             [variance_y, variance_x1, variance_x2], 
                             [sd_y, sd_x1, sd_x2]], 
                            columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2 (x1-mean(x1)) * (x2-mean(x2)) (y-mean(y)) * (x1-mean(x1)) (y-mean(y)) * (x2-mean(x2))
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334 -51.555208 -4144.770916 140.958970
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440 -13.599386 -46.072516 1.244827
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410 7.131339 438.280824 6.478301
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996 -3.774572 3576.083524 -2.838036
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359 -55.351009 -7652.945616 252.792731
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474 -11.818810 -652.008156 12.493912
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517 2.748600 991.901184 5.599794
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621 -40.649285 -2245.770216 51.911663
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118 -1.783808 1.156984 -0.003319
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471 -5.413212 606.000024 -9.120570
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329 -25.452324 4600.907364 -26.993645
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438 21.571905 3200.220334 37.449450
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438 13.482755 666.753144 12.483619
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438 -8.875456 -53.814656 1.530608
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438 -32.037557 2228.816304 -17.561797
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438 27.601085 9997.093274 91.432722
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438 19.461692 6162.521584 79.934070
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438 9.563787 1879.637724 49.613322
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438 -12.492964 -1624.838236 32.832103
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438 -36.358470 -1530.259236 10.624622
y x1 x2
mean -86.953600 58.935000 0.502432
variance 7057.696185 1739.624275 0.387350
sd 84.010096 41.708803 0.622375
covar_x1x2 = mean(excel['(x1-mean(x1)) * (x2-mean(x2))'])
covar_x1y = mean(excel['(y-mean(y)) * (x1-mean(x1))'])
covar_x2y = mean(excel['(y-mean(y)) * (x2-mean(x2))'])

corr_x1x2 = covar_x1x2 / (sd_x1 * sd_x2)
corr_x1y = covar_x1y / (sd_x1 * sd_y)
corr_x2y = covar_x2y / (sd_x2 * sd_y)
display(excel, pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                             [variance_y, variance_x1, variance_x2], 
                             [sd_y, sd_x1, sd_x2]], 
                            columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']),
       pd.DataFrame([[covar_x1x2, covar_x1y, covar_x2y], [corr_x1x2, corr_x1y, corr_x2y]], 
                    index=['covariance', 'correlation'], columns=['x1,x2','x1,y', 'x2,y']))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2 (x1-mean(x1)) * (x2-mean(x2)) (y-mean(y)) * (x1-mean(x1)) (y-mean(y)) * (x2-mean(x2))
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334 -51.555208 -4144.770916 140.958970
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440 -13.599386 -46.072516 1.244827
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410 7.131339 438.280824 6.478301
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996 -3.774572 3576.083524 -2.838036
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359 -55.351009 -7652.945616 252.792731
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474 -11.818810 -652.008156 12.493912
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517 2.748600 991.901184 5.599794
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621 -40.649285 -2245.770216 51.911663
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118 -1.783808 1.156984 -0.003319
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471 -5.413212 606.000024 -9.120570
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329 -25.452324 4600.907364 -26.993645
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438 21.571905 3200.220334 37.449450
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438 13.482755 666.753144 12.483619
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438 -8.875456 -53.814656 1.530608
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438 -32.037557 2228.816304 -17.561797
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438 27.601085 9997.093274 91.432722
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438 19.461692 6162.521584 79.934070
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438 9.563787 1879.637724 49.613322
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438 -12.492964 -1624.838236 32.832103
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438 -36.358470 -1530.259236 10.624622
y x1 x2
mean -86.953600 58.935000 0.502432
variance 7057.696185 1739.624275 0.387350
sd 84.010096 41.708803 0.622375
x1,x2 x1,y x2,y
covariance -9.880045 819.944636 36.543167
correlation -0.380609 0.234005 0.698912
w1 = (corr_x1y - corr_x2y * corr_x1x2) * sd_y / ((1 - corr_x1x2 ** 2) * sd_x1)
w2 = (corr_x2y - corr_x1y * corr_x1x2) * sd_y / ((1 - corr_x1x2 ** 2) * sd_x2)
t = mean_y - w1 * mean_x1 - w2 * mean_x2
display(excel, 
        pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                      [variance_y, variance_x1, variance_x2], 
                      [sd_y, sd_x1, sd_x2]], 
                     columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']),
        pd.DataFrame([[covar_x1x2, covar_x1y, covar_x2y], [corr_x1x2, corr_x1y, corr_x2y]], 
                     index=['covariance', 'correlation'], columns=['x1,x2','x1,y', 'x2,y']),
        pd.DataFrame([[w1, w2, t]], columns=["w1", "w2", "t"], index=["y = f(x) = w1x1 + w2x2 + t"]))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2 (x1-mean(x1)) * (x2-mean(x2)) (y-mean(y)) * (x1-mean(x1)) (y-mean(y)) * (x2-mean(x2))
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334 -51.555208 -4144.770916 140.958970
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440 -13.599386 -46.072516 1.244827
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410 7.131339 438.280824 6.478301
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996 -3.774572 3576.083524 -2.838036
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359 -55.351009 -7652.945616 252.792731
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474 -11.818810 -652.008156 12.493912
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517 2.748600 991.901184 5.599794
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621 -40.649285 -2245.770216 51.911663
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118 -1.783808 1.156984 -0.003319
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471 -5.413212 606.000024 -9.120570
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329 -25.452324 4600.907364 -26.993645
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438 21.571905 3200.220334 37.449450
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438 13.482755 666.753144 12.483619
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438 -8.875456 -53.814656 1.530608
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438 -32.037557 2228.816304 -17.561797
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438 27.601085 9997.093274 91.432722
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438 19.461692 6162.521584 79.934070
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438 9.563787 1879.637724 49.613322
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438 -12.492964 -1624.838236 32.832103
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438 -36.358470 -1530.259236 10.624622
y x1 x2
mean -86.953600 58.935000 0.502432
variance 7057.696185 1739.624275 0.387350
sd 84.010096 41.708803 0.622375
x1,x2 x1,y x2,y
covariance -9.880045 819.944636 36.543167
correlation -0.380609 0.234005 0.698912
w1 w2 t
y = f(x) = w1x1 + w2x2 + t 1.177751 124.381955 -218.857782
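
As a sanity check, the rounded numbers printed in the summary tables are enough to reproduce the coefficients by hand. The sketch below copies those rounded values, so it only agrees with the full-precision result to a few digits. The variable names are my own and are deliberately different from the notebook's, so that running it does not overwrite `w1`, `w2`, or `t`.

```python
# Rounded values copied from the summary tables above
m_y, m_x1, m_x2 = -86.9536, 58.935, 0.502432
s_y, s_x1, s_x2 = 84.010096, 41.708803, 0.622375
r12, r1y, r2y = -0.380609, 0.234005, 0.698912

# Same formulas as in the notebook cell, applied to the rounded inputs
w1_check = (r1y - r2y * r12) * s_y / ((1 - r12 ** 2) * s_x1)
w2_check = (r2y - r1y * r12) * s_y / ((1 - r12 ** 2) * s_x2)
t_check = m_y - w1_check * m_x1 - w2_check * m_x2
print(w1_check, w2_check, t_check)
```
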
def f(x1, x2):
    return w1 * x1 + w2 * x2 + t
excel['f(x1,x2)'] = f(excel['x1'], excel['x2'])
display(excel, 
        pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                      [variance_y, variance_x1, variance_x2], 
                      [sd_y, sd_x1, sd_x2]], 
                     columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']),
        pd.DataFrame([[covar_x1x2, covar_x1y, covar_x2y], [corr_x1x2, corr_x1y, corr_x2y]], 
                     index=['covariance', 'correlation'], columns=['x1,x2','x1,y', 'x2,y']),
        pd.DataFrame([[w1, w2, t]], columns=["w1", "w2", "t"], index=["y = f(x) = w1x1 + w2x2 + t"]))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2 (x1-mean(x1)) * (x2-mean(x2)) (y-mean(y)) * (x1-mean(x1)) (y-mean(y)) * (x2-mean(x2)) f(x1,x2)
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334 -51.555208 -4144.770916 140.958970 31.889208
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440 -13.599386 -46.072516 1.244827 -37.980042
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410 7.131339 438.280824 6.478301 -20.701425
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996 -3.774572 3576.083524 -2.838036 -12.537650
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359 -55.351009 -7652.945616 252.792731 33.020506
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474 -11.818810 -652.008156 12.493912 -57.010503
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517 2.748600 991.901184 5.599794 -45.472479
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621 -40.649285 -2245.770216 51.911663 -15.774140
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118 -1.783808 1.156984 -0.003319 -107.422743
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471 -5.413212 606.000024 -9.120570 -100.120108
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329 -25.452324 4600.907364 -26.993645 -57.446172
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438 21.571905 3200.220334 37.449450 -200.013769
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438 13.482755 666.753144 12.483619 -181.051981
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438 -8.875456 -53.814656 1.530608 -128.642069
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438 -32.037557 2228.816304 -17.561797 -74.347755
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438 27.601085 9997.093274 91.432722 -214.146779
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438 19.461692 6162.521584 79.934070 -195.067215
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438 9.563787 1879.637724 49.613322 -171.865524
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438 -12.492964 -1624.838236 32.832103 -120.162263
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438 -36.358470 -1530.259236 10.624622 -64.219098
y x1 x2
mean -86.953600 58.935000 0.502432
variance 7057.696185 1739.624275 0.387350
sd 84.010096 41.708803 0.622375
x1,x2 x1,y x2,y
covariance -9.880045 819.944636 36.543167
correlation -0.380609 0.234005 0.698912
w1 w2 t
y = f(x) = w1x1 + w2x2 + t 1.177751 124.381955 -218.857782
excel['(y-f(x1,x2))**2'] = (excel['y'] - excel['f(x1,x2)'])**2
display(excel, 
        pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                      [variance_y, variance_x1, variance_x2], 
                      [sd_y, sd_x1, sd_x2]], 
                     columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']),
        pd.DataFrame([[covar_x1x2, covar_x1y, covar_x2y], [corr_x1x2, corr_x1y, corr_x2y]], 
                     index=['covariance', 'correlation'], columns=['x1,x2','x1,y', 'x2,y']),
        pd.DataFrame([[w1, w2, t]], columns=["w1", "w2", "t"], index=["y = f(x) = w1x1 + w2x2 + t"]))
molecule boiling point molecular weight dipole monent y x1 x2 y-mean(y) x1-mean(x1) x2-mean(x2) (y-mean(y))**2 (x1-mean(x1))**2 (x2-mean(x2))**2 (x1-mean(x1)) * (x2-mean(x2)) (y-mean(y)) * (x1-mean(x1)) (y-mean(y)) * (x2-mean(x2)) f(x1,x2) (y-f(x1,x2))**2
0 HF 19.500 20.0 1.826567 19.500 20.0 1.826567 106.4536 -38.935 1.324135 11332.368953 1515.934225 1.753334 -51.555208 -4144.770916 140.958970 31.889208 153.492486
1 HCl -84.900 36.5 1.108600 -84.900 36.5 1.108600 2.0536 -22.435 0.606168 4.217273 503.329225 0.367440 -13.599386 -46.072516 1.244827 -37.980042 2201.482478
2 HBr -67.000 80.9 0.827100 -67.000 80.9 0.827100 19.9536 21.965 0.324668 398.146153 482.461225 0.105410 7.131339 438.280824 6.478301 -20.701425 2143.558032
3 HI -35.100 127.9 0.447700 -35.100 127.9 0.447700 51.8536 68.965 -0.054732 2688.795833 4756.171225 0.002996 -3.774572 3576.083524 -2.838036 -12.537650 509.059649
4 H2O 100.000 18.0 1.854600 100.000 18.0 1.854600 186.9536 -40.935 1.352168 34951.648553 1675.674225 1.828359 -55.351009 -7652.945616 252.792731 33.020506 4486.252600
5 H2S -60.700 34.1 0.978325 -60.700 34.1 0.978325 26.2536 -24.835 0.475893 689.251513 616.777225 0.226474 -11.818810 -652.008156 12.493912 -57.010503 13.612388
6 H2Se -42.000 81.0 0.627000 -42.000 81.0 0.627000 44.9536 22.065 0.124568 2020.826153 486.864225 0.015517 2.748600 991.901184 5.599794 -45.472479 12.058112
7 NH3 -33.400 17.0 1.471772 -33.400 17.0 1.471772 53.5536 -41.935 0.969340 2867.988073 1758.544225 0.939621 -40.649285 -2245.770216 51.911663 -15.774140 310.670951
8 PH3 -87.000 34.0 0.573970 -87.000 34.0 0.573970 -0.0464 -24.935 0.071538 0.002153 621.754225 0.005118 -1.783808 1.156984 -0.003319 -107.422743 417.088447
9 AsH3 -55.000 77.9 0.217000 -55.000 77.9 0.217000 31.9536 18.965 -0.285432 1021.032553 359.671225 0.081471 -5.413212 606.000024 -9.120570 -100.120108 2035.824173
10 SbH3 -17.100 124.8 0.116000 -17.100 124.8 0.116000 69.8536 65.865 -0.386432 4879.525433 4338.198225 0.149329 -25.452324 4600.907364 -26.993645 -57.446172 1627.813574
11 CH4 -161.490 16.0 0.000000 -161.490 16.0 0.000000 -74.5364 -42.935 -0.502432 5555.674925 1843.414225 0.252438 21.571905 3200.220334 37.449450 -200.013769 1484.080775
12 SiH4 -111.800 32.1 0.000000 -111.800 32.1 0.000000 -24.8464 -26.835 -0.502432 617.343593 720.117225 0.252438 13.482755 666.753144 12.483619 -181.051981 4795.836813
13 GeH4 -90.000 76.6 0.000000 -90.000 76.6 0.000000 -3.0464 17.665 -0.502432 9.280553 312.052225 0.252438 -8.875456 -53.814656 1.530608 -128.642069 1493.209464
14 SnH4 -52.000 122.7 0.000000 -52.000 122.7 0.000000 34.9536 63.765 -0.502432 1221.754153 4065.975225 0.252438 -32.037557 2228.816304 -17.561797 -74.347755 499.422165
15 He -268.934 4.0 0.000000 -268.934 4.0 0.000000 -181.9804 -54.935 -0.502432 33116.865984 3017.854225 0.252438 27.601085 9997.093274 91.432722 -214.146779 3001.639592
16 Ne -246.048 20.2 0.000000 -246.048 20.2 0.000000 -159.0944 -38.735 -0.502432 25311.028111 1500.400225 0.252438 19.461692 6162.521584 79.934070 -195.067215 2599.040392
17 Ar -185.700 39.9 0.000000 -185.700 39.9 0.000000 -98.7464 -19.035 -0.502432 9750.851513 362.331225 0.252438 9.563787 1879.637724 49.613322 -171.865524 191.392724
18 Kr -152.300 83.8 0.000000 -152.300 83.8 0.000000 -65.3464 24.865 -0.502432 4270.151993 618.268225 0.252438 -12.492964 -1624.838236 32.832103 -120.162263 1032.834166
19 Xe -108.100 131.3 0.000000 -108.100 131.3 0.000000 -21.1464 72.365 -0.502432 447.170233 5236.693225 0.252438 -36.358470 -1530.259236 10.624622 -64.219098 1925.533552
                    y           x1        x2
mean       -86.953600    58.935000  0.502432
variance  7057.696185  1739.624275  0.387350
sd          84.010096    41.708803  0.622375

                 x1,x2        x1,y       x2,y
covariance   -9.880045  819.944636  36.543167
correlation  -0.380609    0.234005   0.698912

                                  w1          w2            t
y = f(x) = w1x1 + w2x2 + t  1.177751  124.381955  -218.857782
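
For reference, the coefficient of determination computed in the next step follows the standard textbook definition, written here with the same symbols as the table columns:

$$
R^2 = 1 - \frac{\sum_i \bigl(y_i - f(x_{1i}, x_{2i})\bigr)^2}{\sum_i \bigl(y_i - \bar{y}\bigr)^2}
$$

The numerator is the sum of the `(y-f(x1,x2))**2` column and the denominator is the sum of the `(y-mean(y))**2` column, so both pieces are already sitting in the spreadsheet.
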
# Coefficient of determination: 1 - (sum of squared residuals) / (total sum of squares)
r2 = 1. - sum(excel['(y-f(x1,x2))**2']) / sum(excel['(y-mean(y))**2'])
display(excel, 
        pd.DataFrame([[mean_y, mean_x1, mean_x2], 
                      [variance_y, variance_x1, variance_x2], 
                      [sd_y, sd_x1, sd_x2]], 
                     columns=['y','x1', 'x2'], index=['mean', 'variance', 'sd']),
        pd.DataFrame([[covar_x1x2, covar_x1y, covar_x2y], [corr_x1x2, corr_x1y, corr_x2y]], 
                     index=['covariance', 'correlation'], columns=['x1,x2','x1,y', 'x2,y']),
        pd.DataFrame([[w1, w2, t, r2]], columns=["w1", "w2", "t", "r2"], index=["y = f(x) = w1x1 + w2x2 + t"]))
                                  w1          w2            t       r2
y = f(x) = w1x1 + w2x2 + t  1.177751  124.381955  -218.857782  0.78085
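
As a sanity check on the spreadsheet-style result, here is a minimal sketch that fits the same model by ordinary least squares with NumPy. It is a cross-check, not the method used in this article. The data is copied from the table at the top, and the names `A`, `coef`, `w1_np`, `w2_np`, `t_np` and `r2_np` are made up for this example. If the hand calculation is right, the printed values should agree with the coefficient table to several decimal places.

```python
import numpy as np

# Boiling point (y), molecular weight (x1) and dipole moment (x2),
# copied from the data table at the top of the article.
y = np.array([19.5, -84.9, -67.0, -35.1, 100.0, -60.7, -42, -33.4, -87, -55,
              -17.1, -161.49, -111.8, -90, -52, -268.934, -246.048, -185.7,
              -152.3, -108.1])
x1 = np.array([20.0, 36.5, 80.9, 127.9, 18.0, 34.1, 81.0, 17.0, 34.0, 77.9,
               124.8, 16.0, 32.1, 76.6, 122.7, 4.0, 20.2, 39.9, 83.8, 131.3])
x2 = np.array([1.826567, 1.1086, 0.8271, 0.4477, 1.8546, 0.978325, 0.627,
               1.471772, 0.57397, 0.217, 0.116, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Design matrix with a column of ones so the intercept t is fitted as well.
A = np.column_stack([x1, x2, np.ones_like(x1)])

# Least-squares solution of A @ [w1, w2, t] ~= y.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
w1_np, w2_np, t_np = coef

# R^2 = 1 - (sum of squared residuals) / (total sum of squares).
residuals = y - A @ coef
r2_np = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(w1_np, w2_np, t_np, r2_np)
```
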

Done!
