Taking on Kaggle's House Prices competition

Posted at 2019-11-07

Overview

I took on a Kaggle competition that asks you to predict house prices from features such as lot area and building age.

Importing libraries

import pandas as pd

Loading the training and test data

train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
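
Before modeling, it can help to check which columns are non-numeric and which contain missing values. The following is a minimal sketch, not part of the original workflow: `summarize_columns` is a hypothetical helper, and the small `sample` frame only stands in for `train` (the column names exist in the competition data, but the values are made up).

```python
import numpy as np
import pandas as pd


def summarize_columns(df):
    """Return dtype and missing-value count for every column (hypothetical helper)."""
    return pd.DataFrame({
        'dtype': df.dtypes.astype(str),
        'n_missing': df.isna().sum(),
    })


# Made-up stand-in for the real data loaded with pd.read_csv('train.csv').
sample = pd.DataFrame({
    'LotArea': [8450, 9600, 11250],
    'LotFrontage': [65.0, np.nan, 68.0],
    'Street': ['Pave', 'Pave', None],
})
summary = summarize_columns(sample)
print(summary)
```

With the real data, `summarize_columns(train)` would list all columns in the same way.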

Splitting the training and test data

Separate the target variable "SalePrice" from the remaining explanatory variables (the features).

train_x = train.drop(['Id', 'SalePrice'], axis=1)
train_y = train['SalePrice']
test_x = test.drop(['Id'], axis=1)

Converting categorical variables to numbers

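# Encode only the non-numeric columns; numeric columns keep their original values.
for column in train_x.select_dtypes(exclude='number').columns:
    # Factorize train and test together so each category gets the same code in both.
    combined = pd.concat([train_x[column], test_x[column]])
    labels, uniques = pd.factorize(combined)
    train_x[column] = labels[:len(train_x)]
    test_x[column] = labels[len(train_x):]
# LinearRegression cannot handle NaN, so fill remaining gaps with the training medians.
medians = train_x.median()
train_x = train_x.fillna(medians)
test_x = test_x.fillna(medians)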
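
As a small illustration of what `pd.factorize` does, and of why encoding the training and test sets separately can give the same category different codes, here is a sketch on made-up values. The two Series are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Made-up values standing in for one categorical column in train and test.
train_col = pd.Series(['Pave', 'Grvl', 'Pave'])
test_col = pd.Series(['Grvl', 'Pave', None])

# Encoding each set separately: codes follow the order of first appearance,
# so 'Grvl' is 1 in train but 0 in test.
train_codes, _ = pd.factorize(train_col)
test_codes, _ = pd.factorize(test_col)

# Encoding the concatenated column gives one shared mapping.
combined = pd.concat([train_col, test_col], ignore_index=True)
codes, uniques = pd.factorize(combined)
shared_train = codes[:len(train_col)]
shared_test = codes[len(train_col):]

print(train_codes, test_codes)    # separate encodings
print(shared_train, shared_test)  # shared encoding; missing values become -1
```

Note that the integer codes carry no natural order, so a linear model treats them as if they did; one-hot encoding (for example with `pd.get_dummies`) is a common alternative.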

Fitting a linear regression

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(train_x, train_y)
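
Fitting on the full training set gives no estimate of how well the model generalizes. One way to get such an estimate is k-fold cross-validation. The sketch below is an assumption, not something the original article does, and it uses synthetic data from `make_regression` in place of `train_x` and `train_y` so that it runs on its own.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for train_x / train_y.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# 5-fold cross-validation; scikit-learn reports negated MSE so that higher is better.
scores = cross_val_score(LinearRegression(), X, y, cv=5,
                         scoring='neg_mean_squared_error')
rmse_per_fold = np.sqrt(-scores)
print(rmse_per_fold.mean())
```

With the real data, `X` and `y` would be replaced by `train_x` and `train_y`.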

Prediction

pred_y = regressor.predict(test_x)

Creating the submission CSV

submission = pd.DataFrame({'Id':test['Id'], 'SalePrice':pred_y})
submission.to_csv('submission.csv', index=False)
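
A quick sanity check before uploading can catch a malformed file. This is a minimal sketch, assuming the submission needs exactly the `Id` and `SalePrice` columns with no missing values; `check_submission` is a hypothetical helper and the sample values are made up.

```python
import io

import pandas as pd


def check_submission(df, expected_rows):
    """Return a list of problems found in a submission frame (hypothetical helper)."""
    problems = []
    if list(df.columns) != ['Id', 'SalePrice']:
        problems.append('unexpected columns')
    elif df['SalePrice'].isna().any():
        problems.append('missing predictions')
    if len(df) != expected_rows:
        problems.append('wrong number of rows')
    return problems


# Made-up predictions written the same way as above, then read back.
sample = pd.DataFrame({'Id': [1461, 1462], 'SalePrice': [120000.0, 155000.0]})
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)
roundtrip = pd.read_csv(buffer)
print(check_submission(roundtrip, expected_rows=2))
```

With the real file, the same check could be run as `check_submission(pd.read_csv('submission.csv'), expected_rows=len(test))`.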