
Machine Learning with Python - Modeling

Posted at 2018-02-11

Building the prediction model

First, import the required libraries.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

Load the preprocessed data files.

train = pd.read_csv("train_prep.csv")
test = pd.read_csv("test_prep.csv")
train.head(1)
Unnamed: 0 PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked SexInt AgeFillNa FareFillNa EmbarkedInt NumFamily IsAlone
0 0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.25 NaN S 0 22.0 7.25 0 1 0
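The `Unnamed: 0` column in the output above is most likely the row index that was written out when the preprocessed file was saved. A minimal sketch of one way to avoid it, assuming the file was written with pandas' default `to_csv` settings; the small inline CSV is a made-up stand-in for `train_prep.csv`, not the real file:

```python
import io
import pandas as pd

# Hypothetical stand-in for train_prep.csv: the first, unnamed column is a saved index.
csv_text = ",PassengerId,Survived,Pclass\n0,1,0,3\n1,2,1,1\n"

# Default read: the saved index shows up as a column called "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))

# Reading with index_col=0 uses that first column as the index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

Since the model below only uses the columns listed in `expvars`, the extra column is harmless here; this only keeps the frame tidy.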

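Prepare the explanatory variables (inputs) and the target variable (output) from the training data.

expvars = ["Pclass","SexInt","AgeFillNa","FareFillNa","IsAlone"] # list of explanatory variables
X_train = train[expvars].copy() # explanatory variables
Y_train = train["Survived"] # target variable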
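Before fitting, it can help to confirm that the selected columns contain no missing values. The column names `AgeFillNa` and `FareFillNa` suggest they are filled-in versions of `Age` and `Fare`, but that is an assumption from the names alone. A small sketch with a made-up two-row frame, not the real data:

```python
import numpy as np
import pandas as pd

# Made-up miniature of the training frame (values are illustrative only).
df = pd.DataFrame({
    "Age": [22.0, np.nan],
    "AgeFillNa": [22.0, 29.7],
    "Survived": [0, 1],
})

expvars = ["AgeFillNa"]
X = df[expvars]
y = df["Survived"]

# Count missing values per explanatory variable; all zeros means no gaps remain.
missing = X.isna().sum()
print(missing)
print(X.shape, y.shape)
```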

Fit a random forest on the training data to build the prediction model.

clf = RandomForestClassifier()
clf.fit(X_train, Y_train)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)
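The model above is fitted with `n_estimators=10` and `random_state=None`, as shown in its printed parameters, so the trees differ from run to run. As a rough sketch, fixing `random_state` makes the fit reproducible, and `cross_val_score` gives an accuracy estimate from the training data alone. The synthetic data from `make_classification` and the value `n_estimators=100` are assumptions for illustration, not the article's settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for (X_train, Y_train): 200 rows, 5 features.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# A fixed random_state makes repeated fits build identical trees.
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: one accuracy score per fold.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```

With the real data, `X` and `y` would be replaced by `X_train` and `Y_train`.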

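Checking the prediction model

Use the fitted model to predict whether each passenger in the test data survived.

X_test = test[expvars].copy() # explanatory variables of the test data
prediction = clf.predict(X_test) # predicted class (0 = died, 1 = survived)
pred_proba = clf.predict_proba(X_test) # predicted class probabilities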
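`predict` and `predict_proba` are closely related: for a random forest classifier, the predicted class is the one with the highest averaged probability, and the columns of `predict_proba` follow the order of `clf.classes_`. That is why column 1 can be read as the survival probability when the classes are 0 and 1. A sketch on synthetic data (the data and `random_state` are assumptions, not the Titanic files):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data standing in for the passenger features.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

labels = clf.predict(X[:20])
proba = clf.predict_proba(X[:20])  # shape (20, 2), one column per class

# Taking the most probable column and mapping it through classes_
# reproduces the output of predict().
from_proba = clf.classes_[np.argmax(proba, axis=1)]
print((labels == from_proba).all())
```

With only ten trees, the averaged probabilities tend to fall on coarse steps such as 0.1, 0.2 and so on, since each tree contributes one tenth of the vote.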

Attach the predictions to the passenger data and display them together.

test["PredictedSurvival"] = prediction
test["SurvivalProba"] = pred_proba[:,1]

test[["PassengerId","Pclass","Name","Sex","Age","SibSp","Parch","Fare","Embarked","PredictedSurvival","SurvivalProba"]]
PassengerId Pclass Name Sex Age SibSp Parch Fare Embarked PredictedSurvival SurvivalProba
0 892 3 Kelly, Mr. James male 34.5 0 0 7.8292 Q 0 0.000000
1 893 3 Wilkes, Mrs. James (Ellen Needs) female 47.0 1 0 7.0000 S 0 0.300000
2 894 2 Myles, Mr. Thomas Francis male 62.0 0 0 9.6875 Q 1 0.700000
3 895 3 Wirz, Mr. Albert male 27.0 0 0 8.6625 S 1 0.700000
4 896 3 Hirvonen, Mrs. Alexander (Helga E Lindqvist) female 22.0 1 1 12.2875 S 1 0.700000
5 897 3 Svensson, Mr. Johan Cervin male 14.0 0 0 9.2250 S 0 0.200000
6 898 3 Connolly, Miss. Kate female 30.0 0 0 7.6292 Q 0 0.100000
7 899 2 Caldwell, Mr. Albert Francis male 26.0 1 1 29.0000 S 0 0.000000
8 900 3 Abrahim, Mrs. Joseph (Sophie Halaut Easu) female 18.0 0 0 7.2292 C 1 1.000000
9 901 3 Davies, Mr. John Samuel male 21.0 2 0 24.1500 S 0 0.000000
10 902 3 Ilieff, Mr. Ylio male NaN 0 0 7.8958 S 0 0.200000
11 903 1 Jones, Mr. Charles Cresson male 46.0 0 0 26.0000 S 0 0.350000
12 904 1 Snyder, Mrs. John Pillsbury (Nelle Stevenson) female 23.0 1 0 82.2667 S 1 1.000000
13 905 2 Howard, Mr. Benjamin male 63.0 1 0 26.0000 S 0 0.000000
14 906 1 Chaffee, Mrs. Herbert Fuller (Carrie Constance... female 47.0 1 0 61.1750 S 1 1.000000
15 907 2 del Carlo, Mrs. Sebastiano (Argenia Genovesi) female 24.0 1 0 27.7208 C 1 1.000000
16 908 2 Keane, Mr. Daniel male 35.0 0 0 12.3500 Q 0 0.025000
17 909 3 Assaf, Mr. Gerios male 21.0 0 0 7.2250 C 1 0.800000
18 910 3 Ilmakangas, Miss. Ida Livija female 27.0 1 0 7.9250 S 1 0.600000
19 911 3 Assaf Khalil, Mrs. Mariana (Miriam")" female 45.0 0 0 7.2250 C 0 0.400000
20 912 1 Rothschild, Mr. Martin male 55.0 1 0 59.4000 C 0 0.200000
21 913 3 Olsen, Master. Artur Karl male 9.0 0 1 3.1708 S 0 0.200000
22 914 1 Flegenheim, Mrs. Alfred (Antoinette) female NaN 0 0 31.6833 S 1 0.900000
23 915 1 Williams, Mr. Richard Norris II male 21.0 0 1 61.3792 C 0 0.500000
24 916 1 Ryerson, Mrs. Arthur Larned (Emily Maria Borie) female 48.0 1 3 262.3750 C 1 1.000000
25 917 3 Robins, Mr. Alexander A male 50.0 1 0 14.5000 S 0 0.000000
26 918 1 Ostby, Miss. Helene Ragnhild female 22.0 0 1 61.9792 C 1 1.000000
27 919 3 Daher, Mr. Shedid male 22.5 0 0 7.2250 C 1 0.800000
28 920 1 Brady, Mr. John Bertram male 41.0 0 0 30.5000 S 1 0.800000
29 921 3 Samaan, Mr. Elias male NaN 2 0 21.6792 C 0 0.000000
... ... ... ... ... ... ... ... ... ... ... ...
388 1280 3 Canavan, Mr. Patrick male 21.0 0 0 7.7500 Q 0 0.000000
389 1281 3 Palsson, Master. Paul Folke male 6.0 3 1 21.0750 S 0 0.400000
390 1282 1 Payne, Mr. Vivian Ponsonby male 23.0 0 0 93.5000 S 0 0.100000
391 1283 1 Lines, Mrs. Ernest H (Elizabeth Lindsey James) female 51.0 0 1 39.4000 S 1 0.900000
392 1284 3 Abbott, Master. Eugene Joseph male 13.0 0 2 20.2500 S 0 0.100000
393 1285 2 Gilbert, Mr. William male 47.0 0 0 10.5000 S 0 0.000000
394 1286 3 Kink-Heilmann, Mr. Anton male 29.0 3 1 22.0250 S 0 0.300000
395 1287 1 Smith, Mrs. Lucien Philip (Mary Eloise Hughes) female 18.0 1 0 60.0000 S 1 1.000000
396 1288 3 Colbert, Mr. Patrick male 24.0 0 0 7.2500 Q 0 0.000000
397 1289 1 Frolicher-Stehli, Mrs. Maxmillian (Margaretha ... female 48.0 1 1 79.2000 C 1 1.000000
398 1290 3 Larsson-Rondberg, Mr. Edvard A male 22.0 0 0 7.7750 S 0 0.000000
399 1291 3 Conlon, Mr. Thomas Henry male 31.0 0 0 7.7333 Q 0 0.000000
400 1292 1 Bonnell, Miss. Caroline female 30.0 0 0 164.8667 S 1 1.000000
401 1293 2 Gale, Mr. Harry male 38.0 1 0 21.0000 S 0 0.000000
402 1294 1 Gibson, Miss. Dorothy Winifred female 22.0 0 1 59.4000 C 1 1.000000
403 1295 1 Carrau, Mr. Jose Pedro male 17.0 0 0 47.1000 S 0 0.100000
404 1296 1 Frauenthal, Mr. Isaac Gerald male 43.0 1 0 27.7208 C 0 0.100000
405 1297 2 Nourney, Mr. Alfred (Baron von Drachstedt")" male 20.0 0 0 13.8625 C 0 0.300000
406 1298 2 Ware, Mr. William Jeffery male 23.0 1 0 10.5000 S 0 0.000000
407 1299 1 Widener, Mr. George Dunton male 50.0 1 1 211.5000 C 0 0.500000
408 1300 3 Riordan, Miss. Johanna Hannah"" female NaN 0 0 7.7208 Q 1 1.000000
409 1301 3 Peacock, Miss. Treasteall female 3.0 1 1 13.7750 S 1 0.800000
410 1302 3 Naughton, Miss. Hannah female NaN 0 0 7.7500 Q 1 0.929808
411 1303 1 Minahan, Mrs. William Edward (Lillian E Thorpe) female 37.0 1 0 90.0000 Q 1 1.000000
412 1304 3 Henriksson, Miss. Jenny Lovisa female 28.0 0 0 7.7750 S 1 0.533333
413 1305 3 Spector, Mr. Woolf male NaN 0 0 8.0500 S 0 0.200000
414 1306 1 Oliva y Ocana, Dona. Fermina female 39.0 0 0 108.9000 C 1 1.000000
415 1307 3 Saether, Mr. Simon Sivertsen male 38.5 0 0 7.2500 S 0 0.000000
416 1308 3 Ware, Mr. Frederick male NaN 0 0 8.0500 S 0 0.200000
417 1309 3 Peter, Master. Michael J male NaN 1 1 22.3583 C 0 0.100000

418 rows × 11 columns
