Materials Informatics with matminer (3): Predicting Steel Strength with XGBoost

Posted at 2025-04-19

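How to predict the strength

I want to interpret the prediction results with SHAP, so I predict the yield strength with XGBoost, an ensemble of decision trees.
The data has already been preprocessed into a DataFrame of yield strength and elemental composition.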

yield strength  Fe     C         Mn        Si        Cr       Ni
2411.5          0.62   0.000953  0.000521  0.00102   0.00011  0.192
1736.3          0.623  0.00854   0.000104  0.000203  0.147    9.71E-05
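
As a small illustration of the layout described above, the two sample rows can be rebuilt as a pandas DataFrame and split into a feature matrix and a target. This is only a sketch: it assumes exactly the seven columns shown, uses just these two rows, and leaves the units unstated because the article does not give them.

```python
import pandas as pd

# The two sample rows from the table above (illustration only,
# not the full dataset used in the article).
df = pd.DataFrame(
    {
        "yield strength": [2411.5, 1736.3],
        "Fe": [0.62, 0.623],
        "C": [0.000953, 0.00854],
        "Mn": [0.000521, 0.000104],
        "Si": [0.00102, 0.000203],
        "Cr": [0.00011, 0.147],
        "Ni": [0.192, 9.71e-05],
    }
)

# Features are every column except the target; the target is yield strength.
X = df.drop(columns="yield strength")
y = df["yield strength"]

print(X.columns.tolist())
print(X.shape, y.shape)
```

The same `X` / `y` split is what the prediction program expects to find in `df`.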

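Creating the prediction program with generative AI (Gemini)

I had Gemini write the prediction program. The instruction was a simple, casual prompt like this:

Put yield strength into y and everything else into x, and run machine learning that predicts y. The algorithm is XGBoost. Optimize the hyperparameters with optuna. Show the results in a graph.

This prompt alone produces the following program.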

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import xgboost as xgb
import optuna

# Prepare the data (df is the preprocessed DataFrame of yield strength and composition)
X = df.drop(columns="yield strength")
y = df["yield strength"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the objective function for Optuna
def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10),
        "objective": "reg:squarederror",  # Specify the objective function
        "random_state": 42
    }
    model = xgb.XGBRegressor(**params)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    return rmse

# Run Optuna optimization
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=200)  # Adjust n_trials as needed

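# Train the model with the best hyperparameters
# (study.best_params holds only the tuned values, so the fixed settings
# used in objective() are passed again here)
best_params = study.best_params
best_model = xgb.XGBRegressor(**best_params, objective="reg:squarederror", random_state=42)
best_model.fit(X_train, y_train)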

# Make predictions
y_pred_train = best_model.predict(X_train)
y_pred = best_model.predict(X_test)

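# Evaluate the model on the held-out test set
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test RMSE: {rmse}")
print(f"Best Hyperparameters: {best_params}")
# score() returns the coefficient of determination (R2) for regressors
print(f"R2 (Training Data): {best_model.score(X_train, y_train):.3f}")
print(f"R2 (Testing Data) : {best_model.score(X_test, y_test):.3f}")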

# Plot the results
plt.figure(figsize=(8, 6))
plt.scatter(y_train, y_pred_train, alpha=0.5, label="Train")
plt.scatter(y_test, y_pred, alpha=0.5, label="Test")
# Diagonal y = x over the full data range: points on it are predicted exactly
lims = [y.min(), y.max()]
plt.plot(lims, lims, color='red', linestyle='--')
plt.xlabel("Actual Yield Strength")
plt.ylabel("Predicted Yield Strength")
plt.title("Actual vs. Predicted Yield Strength (XGBoost)")
plt.legend()
plt.grid(True)
plt.show()

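I only increased the number of optuna trials; everything else is basically as generated.
Running the generated program produces the following plot.
[Figure: 予測グラフ.png, actual vs. predicted yield strength]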
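
One thing worth noting about the generated program: `objective` scores every trial on `X_test`, so the same data that selects the hyperparameters is later used to report the test score, which tends to make that score optimistic. A possible way around this, sketched here under assumptions and not part of the generated program, is to carve a validation set out of the training data and have the objective predict on that instead. The helper name `three_way_split`, the `val_size` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def three_way_split(X, y, test_size=0.2, val_size=0.2, random_state=42):
    """Split into train / validation / test (hypothetical helper).

    The test set is separated first and never touched during tuning;
    the validation set is taken from what remains.
    """
    X_trval, X_test, y_trval, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    X_tr, X_val, y_tr, y_val = train_test_split(
        X_trval, y_trval, test_size=val_size, random_state=random_state
    )
    return (X_tr, y_tr), (X_val, y_val), (X_test, y_test)


# Synthetic stand-in data: 100 samples, 6 composition-like features.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 6))
y_demo = rng.random(100)

(X_tr, y_tr), (X_val, y_val), (X_te, y_te) = three_way_split(X_demo, y_demo)
print(len(X_tr), len(X_val), len(X_te))
```

With such a split, the objective would fit on `(X_tr, y_tr)` and compute its RMSE on `(X_val, y_val)`, leaving the test set for the final evaluation only. K-fold cross-validation on the training data would be another option for a dataset this small.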

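The predictions agree well with the measured values, although the gap between the training and test R2 suggests some overfitting.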
R2 (Training Data): 0.971
R2 (Testing Data) : 0.845
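
For readers less familiar with the two metrics used here, the following minimal sketch computes RMSE and R2 by hand in plain Python. The numbers are made up for illustration and are not taken from the steel dataset; the function names `rmse` and `r2` are my own.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of the prediction error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values, each prediction off by 100.
actual = [1000.0, 1500.0, 2000.0, 2500.0]
predicted = [1100.0, 1400.0, 2100.0, 2400.0]

print(f"RMSE: {rmse(actual, predicted):.1f}")
print(f"R2:   {r2(actual, predicted):.3f}")
```

An R2 of 1 means every point lies on the diagonal of the actual vs. predicted plot, while 0 means the model does no better than always predicting the mean.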
