18
17

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

[scikit-learn, matplotlib] 重回帰分析と3D描画

Last updated at Posted at 2016-09-20

scikit-learnに付属しているデータBoston House Prices dataset(ボストンの住宅価格に関するデータセット)を使って重回帰分析と描画を行いました.(参考

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
from mpl_toolkits.mplot3d import Axes3D

#Boston HOuse Pricesのデータをロード
boston = load_boston()
boston_df = pd.DataFrame(boston.data)
boston_df.columns = boston.feature_names
boston['Price'] = boston.target #目的変数をデータフレームに追加

データセットには,各住宅の部屋数(RM),下位パーセンテージ?(LSTAT)等,13の属性(コラム)が含まれます. まずDataFrameクラスのcorr()メソッドを使って各属性と目的変数(Price)の相関係数を計算します.

>> boston_df.corr()
             CRIM        ZN     INDUS      CHAS       NOX        RM       AGE  \
CRIM     1.000000 -0.199458  0.404471 -0.055295  0.417521 -0.219940  0.350784   
ZN      -0.199458  1.000000 -0.533828 -0.042697 -0.516604  0.311991 -0.569537   
INDUS    0.404471 -0.533828  1.000000  0.062938  0.763651 -0.391676  0.644779   
CHAS    -0.055295 -0.042697  0.062938  1.000000  0.091203  0.091251  0.086518   
NOX      0.417521 -0.516604  0.763651  0.091203  1.000000 -0.302188  0.731470   
RM      -0.219940  0.311991 -0.391676  0.091251 -0.302188  1.000000 -0.240265   
AGE      0.350784 -0.569537  0.644779  0.086518  0.731470 -0.240265  1.000000   
DIS     -0.377904  0.664408 -0.708027 -0.099176 -0.769230  0.205246 -0.747881   
RAD      0.622029 -0.311948  0.595129 -0.007368  0.611441 -0.209847  0.456022   
TAX      0.579564 -0.314563  0.720760 -0.035587  0.668023 -0.292048  0.506456   
PTRATIO  0.288250 -0.391679  0.383248 -0.121515  0.188933 -0.355501  0.261515   
B       -0.377365  0.175520 -0.356977  0.048788 -0.380051  0.128069 -0.273534   
LSTAT    0.452220 -0.412995  0.603800 -0.053929  0.590879 -0.613808  0.602339   
Price   -0.385832  0.360445 -0.483725  0.175260 -0.427321  0.695360 -0.376955   

              DIS       RAD       TAX   PTRATIO         B     LSTAT     Price  
CRIM    -0.377904  0.622029  0.579564  0.288250 -0.377365  0.452220 -0.385832  
ZN       0.664408 -0.311948 -0.314563 -0.391679  0.175520 -0.412995  0.360445  
INDUS   -0.708027  0.595129  0.720760  0.383248 -0.356977  0.603800 -0.483725  
CHAS    -0.099176 -0.007368 -0.035587 -0.121515  0.048788 -0.053929  0.175260  
NOX     -0.769230  0.611441  0.668023  0.188933 -0.380051  0.590879 -0.427321  
RM       0.205246 -0.209847 -0.292048 -0.355501  0.128069 -0.613808  0.695360  
AGE     -0.747881  0.456022  0.506456  0.261515 -0.273534  0.602339 -0.376955  
DIS      1.000000 -0.494588 -0.534432 -0.232471  0.291512 -0.496996  0.249929  
RAD     -0.494588  1.000000  0.910228  0.464741 -0.444413  0.488676 -0.381626  
TAX     -0.534432  0.910228  1.000000  0.460853 -0.441808  0.543993 -0.468536  
PTRATIO -0.232471  0.464741  0.460853  1.000000 -0.177383  0.374044 -0.507787  
B        0.291512 -0.444413 -0.441808 -0.177383  1.000000 -0.366087  0.333461  
LSTAT   -0.496996  0.488676  0.543993  0.374044 -0.366087  1.000000 -0.737663  
Price    0.249929 -0.381626 -0.468536 -0.507787  0.333461 -0.737663  1.000000  

各属性の内,相関係数の絶対値が大きい(=相関が大きい)二つの属性RMとLSTATを説明変数として,Priceの値を回帰分析します.

#説明変数要のデータフレーム(説明変数とRMとLSTATを利用)
df = pd.DataFrame()
df['RM'] = boston_df['RM']
df['LSTAT'] = boston_df['LSTAT']

X_multi = df
Y_target = boston.target

#モデル生成とフィッティング
lreg = LinearRegression()
lreg.fit(X_multi, Y_target)
a1, a2 = lreg.coef_ #係数
b = lreg.intercept_ #切片

描画にはmatplotlibのplot_surface()とscatter3Dを利用.

#3D描画(実測値の描画)
x, y, z = np.array(df['RM']), np.array(df['LSTAT']), np.array(Y_target)
fig = plt.figure()
ax = Axes3D(fig)
ax.scatter3D(np.ravel(x), np.ravel(y), np.ravel(z), c = 'red')

#3D描画(回帰平面の描画)
X, Y = np.meshgrid(np.arange(0, 10, 1), np.arange(0, 40, 1))
Z = a1 * X + a2 * Y + b
ax.plot_surface(X, Y, Z, alpha = 0.5) #alphaで透明度を指定
ax.set_xlabel("RM")
ax.set_ylabel("LSTAT")
ax.set_zlabel("Price")

plt.show()

figure_1.png

18
17
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
18
17

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?