
# Ensemble Learning for Regression

Some quick thoughts on ensemble learning.

You see some heroic efforts on Kaggle, but when blending, fine-tuning each model's weight with nested for loops raised my score more than I expected, so I'm leaving this here as a memo.

# Blending

The data comes from Kaggle's House Prices competition.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
import lightgbm as lgb
```
```python
df = pd.read_csv('train.csv')
```
```python
clf1 = lgb.LGBMRegressor()
clf2 = DecisionTreeRegressor(max_depth=2)
clf3 = ElasticNet()
```
```python
X = df.loc[:, ['GrLivArea', 'YearBuilt']]
y = df.loc[:, ['SalePrice']]
```
```python
X.describe()
```
```
         GrLivArea    YearBuilt
count  1460.000000  1460.000000
mean   1515.463699  1971.267808
std     525.480383    30.202904
min     334.000000  1872.000000
25%    1129.500000  1954.000000
50%    1464.000000  1973.000000
75%    1776.750000  2000.000000
max    5642.000000  2010.000000
```
```python
y.describe()
```
```
          SalePrice
count   1460.000000
mean  180921.195890
std    79442.502883
min    34900.000000
25%   129975.000000
50%   163000.000000
75%   214000.000000
max   755000.000000
```
```python
num_test = 0.20
sc = StandardScaler()
X = np.array(X)
y = np.array(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=num_test, random_state=23)
sc.fit(X_train)
X_train = sc.transform(X_train)
X_test = sc.transform(X_test)  # transform the test set with the scaler fitted on the training set
```
```python
clf1.fit(X_train, y_train)  # LightGBM
y_pred1 = clf1.predict(X_test)
np.log(mean_squared_error(y_test, y_pred1))
```
```
21.3135193002795
```
```python
clf2.fit(X_train, y_train)  # decision tree
y_pred2 = clf2.predict(X_test)
np.log(mean_squared_error(y_test, y_pred2))
```
```
21.734726609958688
```
```python
clf3.fit(X_train, y_train)  # ElasticNet
y_pred3 = clf3.predict(X_test)
np.log(mean_squared_error(y_test, y_pred3))
```
```
21.292246763650407
```
```python
y_pred_all = (y_pred1 + y_pred2 + y_pred3) / 3  # simple average of the three regressors
np.log(mean_squared_error(y_test, y_pred_all))
```
```
21.198264315267572
```
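Why does even a plain average help? When the models' errors are at least partly independent, averaging cancels some of the noise. A minimal self-contained sketch with synthetic data (not the House Prices set — the numbers here are illustrative only):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
# three "models": the truth plus independent noise of the same scale
preds = [y_true + rng.normal(scale=0.5, size=1000) for _ in range(3)]
blend = np.mean(preds, axis=0)

mse_each = [mean_squared_error(y_true, p) for p in preds]
mse_blend = mean_squared_error(y_true, blend)
print(mse_each, mse_blend)  # the blend's MSE is well below any single model's
```

With three models whose errors are independent and equal in variance, averaging cuts the error variance to roughly a third — which is exactly the effect the blend above is exploiting.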

```python
y3 = 100  # best (lowest) score seen so far
best_param = {}
for x1 in np.arange(0.1, 3.0, 0.2):
    for x2 in np.arange(0.1, 3.0, 0.2):
        for x3 in np.arange(0.1, 3.0, 0.2):
            y_pred_all = (y_pred1 * x1 + y_pred2 * x2 + y_pred3 * x3) / 3
            y1 = np.log(mean_squared_error(y_test, y_pred_all))
            best_param[y1] = [x1, x2, x3]
            y3 = min(y3, y1)

print(y3)
print(best_param[y3])  # the weights that gave the best score
```
```
21.059275607804445
[1.5000000000000004, 0.1, 1.3000000000000003]
[1.5000000000000004, 0.1, 1.3000000000000003]
```
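The triple for loop works, but the same weight search can also be handed to a numerical optimizer. A sketch using `scipy.optimize.minimize`, assuming SciPy is installed; `y_test` and the stacked predictions here are synthetic stand-ins for the article's `y_test` and `y_pred1`–`y_pred3`, so the snippet runs on its own:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
y_test = rng.normal(loc=12.0, scale=0.5, size=200)
# stand-ins for y_pred1..y_pred3: truth plus noise of different scales
preds = np.stack([y_test + rng.normal(scale=s, size=200) for s in (0.3, 0.6, 0.4)])

def loss(w):
    # MSE of the weighted blend; w has one weight per model
    return mean_squared_error(y_test, w @ preds)

# start from equal weights and let the optimizer refine them
res = minimize(loss, x0=np.ones(3) / 3, method="Nelder-Mead")
print(res.x, loss(res.x))
```

Unlike the grid, this explores weights continuously and scales to more than three models without another nested loop; the trade-off is that it finds a local optimum rather than exhaustively scanning a fixed range.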