How to use train_test_split

Python
ML
scikit-learn

Posted at 2026-01-22

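This post covers how to split your data when starting a machine learning task.
It is easy to get the train/test split, or its order, wrong when working on a machine learning task, so I wrote this as a note to look back on.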

pip install scikit-learn

from sklearn.model_selection import train_test_split

# X: features, y: target variable
# test_size: fraction of the data held out for testing (0.2 = 20%)
# random_state: seed fixed so the split is the same on every run
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=42
)
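
The snippet above assumes X and y already exist. As a minimal sketch of a complete run, the example below uses a small made-up NumPy dataset (the toy values are an assumption for illustration, not from the original post) and checks the order of the four return values: the two splits of X come first, then the two splits of y.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each.
# Row i of X is [2*i, 2*i + 1] and its label is i, so we can
# verify afterwards that features and labels stay paired.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Return order is: X_train, X_test, y_train, y_test
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=42,
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
print(y_train.shape, y_test.shape)  # (8,) (2,)

# Each feature row still lines up with its own label after shuffling.
print((X_train[:, 0] == 2 * y_train).all())  # True
```

With 10 samples and test_size=0.2, 2 samples go to the test set and 8 to the training set. Writing the left-hand side in a different order (for example X_train, y_train, X_test, y_test) does not raise an error, which is why this mistake is easy to miss; printing the shapes right after the split is a cheap way to catch it.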
