中部大学新谷研究室 / Shintani Lab, Chubu University

Data Drift vs. Concept Drift: What Changes, and What Breaks? (A Minimal Experiment)

Last updated at 2026-02-14 / Posted at 2026-02-14

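Intended readers

You have been told that MLOps or production operation needs "drift monitoring," but it is unclear what to actually watch
The difference between "data drift" and "concept drift" has not really clicked
Cross-validation scores look good, yet the model misses in production and you cannot tell why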

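Goals of this article

Terminology first: sort out data drift versus concept drift
With dummy data: see in figures what changes and what breaks
For practice: take home what to monitor (input distribution / performance / prediction distribution) and how to separate the causes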

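TL;DR

Data drift: the input distribution p(x) changes (e.g., a shift in the user base, seasonality, equipment replacement)
- Easy to detect by monitoring the input distribution (PSI, KS, etc.)
- Performance does not necessarily drop, though (sometimes it holds)
Concept drift: the relationship p(y|x) (the "rule") changes (e.g., a definition change, or an environmental change that alters the causal relationship)
- Can happen even when the input distribution does not change
- Performance tends to drop, but without labels this is hard to notice (often discovered late)

The minimal experiment in this article reproduces the following three patterns:

Data drift only: PSI rises (the input change is visible) / AUC is nearly unchanged (the model does not always break)
Concept drift only: PSI stays nearly flat (same inputs) / AUC drops (the model breaks)
Both: PSI rises / AUC drops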

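1. Terminology

Data drift

The distribution on the input (feature) side changes. The typical case is covariate shift, where p(x) changes while p(y|x) stays the same.

Examples: the age mix changes, the equipment is replaced, sensor readings shift with the season

Concept drift

The input-to-output relationship (the rule) changes; that is, p(y|x) changes.

Examples: a business rule changes, the treatment policy for a class of cases changes, a measurement-system spec change alters what a value means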
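
Both definitions can be read off the factorization of the joint distribution. As a notational sketch (the subscripts "train" and "now" are labels introduced here for illustration and do not appear in the experiment code):

```latex
\begin{align*}
&p(x, y) = p(x)\, p(y \mid x) \\
&\text{data drift (covariate shift):}\quad p_{\text{now}}(x) \neq p_{\text{train}}(x), \qquad p_{\text{now}}(y \mid x) = p_{\text{train}}(y \mid x) \\
&\text{concept drift:}\quad p_{\text{now}}(y \mid x) \neq p_{\text{train}}(y \mid x)
\end{align*}
```

Input-side monitoring only sees the first factor, p(x), which is why a change confined to the second factor can go unnoticed without labels.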

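Quick comparison (what changes, what breaks)

Aspect	Data drift	Concept drift
What changes?	`p(x)` (input distribution)	`p(y|x)` (the rule)
Input-distribution monitoring (PSI/KS)	Usually reacts	May not react
Performance (AUC / error)	May or may not drop	Tends to drop
Ease of detection	Fairly feasible without labels	Usually needs labels (so detection lags)
Typical remedies	Retrain with additional data, revisit features	Identify the rule change, retrain, revisit operations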
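
The table can be read as a rough decision rule. The following is a minimal sketch under stated assumptions: the function name `triage`, the returned labels, and both default thresholds are hypothetical and chosen only for illustration, not recommended values.

```python
def triage(psi, auc_drop, psi_threshold=0.1, auc_drop_threshold=0.03):
    """Turn two monitoring signals into a first hypothesis (heuristic only).

    psi       : input-drift score versus the training data
    auc_drop  : baseline AUC minus current AUC (needs labels)
    Both default thresholds are illustrative assumptions.
    """
    input_moved = psi >= psi_threshold
    perf_dropped = auc_drop >= auc_drop_threshold

    if input_moved and perf_dropped:
        # Two signals alone cannot separate harmful data drift from both drifts at once
        return "input_moved_and_performance_dropped"
    if input_moved:
        return "data_drift_suspected"
    if perf_dropped:
        return "concept_drift_suspected"
    return "no_drift_signal"

print(triage(psi=0.30, auc_drop=0.00))  # data_drift_suspected
print(triage(psi=0.01, auc_drop=0.10))  # concept_drift_suspected
print(triage(psi=0.30, auc_drop=0.10))  # input_moved_and_performance_dropped
```

The output is a hypothesis to investigate, not a diagnosis: when both signals fire, the two metrics cannot tell whether the rule itself moved.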

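2. Designing the minimal experiment (what is reproduced, and how)

Build a two-dimensional classification problem

Inputs: x1, x2
Label: y (0/1)

The "true rule" (the real world) is generated with a logistic function:

p(y=1|x) = sigmoid( scale * (w · x + b) )

Here w sets the orientation of the decision boundary (it is the boundary's normal vector), i.e., the rule.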
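
To make the formula concrete, here is a small worked example in plain Python. The helper name `p_positive` is hypothetical; the values w = (1, -1), b = 0 and scale = 1.5 match the day-0 settings used in the experiment code.

```python
import math

def p_positive(x, w, b=0.0, scale=1.5):
    """p(y=1|x) = sigmoid(scale * (w . x + b)) for a single point x."""
    z = scale * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1.0 / (1.0 + math.exp(-z))

w0 = (1.0, -1.0)  # day-0 rule direction

print(round(p_positive((1.0, 1.0), w0), 4))  # w.x = 0  (on the boundary) -> 0.5
print(round(p_positive((1.0, 0.0), w0), 4))  # w.x = 1  -> 0.8176
print(round(p_positive((0.0, 1.0), w0), 4))  # w.x = -1 -> 0.1824
```

Points on the line w · x + b = 0 get probability 0.5, and the label becomes more certain the further a point sits from that line along w.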

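Generate four scenarios, day by day

no_drift: nothing changes
data_drift: only p(x) changes (the mean moves gradually)
concept_drift: only p(y|x) changes (the boundary rotates gradually)
both: both change

Two metrics to monitor

Input-drift metric (PSI): tracks changes in p(x)
Performance (ROC-AUC): whether the model is still right (note: requires labels)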

3. Code for Google Colab (copy and paste to run)

3.1 import

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, accuracy_score, log_loss

3.2 Helper functions (sigmoid / rotation / data generation)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def rotate(v, theta):
    """Rotate the 2D vector v counterclockwise by the angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],[s, c]])
    return R @ v

def generate_batch(n, mu, w, b=0.0, scale=1.5, seed=0):
    """
    X ~ N(mu, I)
    y ~ Bernoulli(sigmoid(scale*(w·x + b)))
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(loc=mu, scale=1.0, size=(n, 2))
    logits = scale * (X @ w + b)
    p = sigmoid(logits)
    y = (rng.random(n) < p).astype(int)
    return X, y

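3.3 A minimal implementation of PSI (Population Stability Index)

PSI compares the training-time distribution with the current distribution over the same bins (here, bins cut at the quantiles of the training data).

Roughly: the larger the PSI, the larger the distribution change
Values such as 0.1 and 0.25 are sometimes used as rule-of-thumb thresholds, but they depend on sample size and use case, so do not rely on them blindly.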
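
Written out, the quantity that the PSI function in this section computes is the following, where e_i is the fraction of training samples in bin i, a_i is the fraction of current samples in the same bin, and B is the number of bins (these symbols are introduced here for illustration):

```latex
\mathrm{PSI} = \sum_{i=1}^{B} \left(a_i - e_i\right)\,\ln\frac{a_i}{e_i}
```

Each term is non-negative because (a_i - e_i) and ln(a_i / e_i) always share the same sign, so PSI is 0 only when the two binned distributions match. With quantile bins taken from the training data, e_i is roughly 1/B for every bin.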

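def psi_1d(train, current, bins=10, eps=1e-8):
    """PSI between a reference sample (train) and a current sample for one feature."""
    # Bin edges from the quantiles of the training data
    qs = np.linspace(0, 1, bins + 1)
    cuts = np.quantile(train, qs)
    # Open the outer bins so current values outside the training range are still counted
    cuts[0] = -np.inf
    cuts[-1] = np.inf

    # Guard against duplicate edges so they stay strictly increasing (safety measure)
    for i in range(1, len(cuts)):
        if cuts[i] <= cuts[i-1]:
            cuts[i] = cuts[i-1] + 1e-6

    exp_counts, _ = np.histogram(train, bins=cuts)
    act_counts, _ = np.histogram(current, bins=cuts)

    # Counts -> per-bin proportions
    exp = exp_counts / exp_counts.sum()
    act = act_counts / act_counts.sum()

    # Avoid log(0) and division by zero for empty bins
    exp = np.clip(exp, eps, None)
    act = np.clip(act, eps, None)

    return np.sum((act - exp) * np.log(act / exp))

def psi_mean_over_features(X_ref, X_cur, bins=10):
    """Return (mean PSI across features, list of per-feature PSI values)."""
    psis = [psi_1d(X_ref[:, j], X_cur[:, j], bins=bins) for j in range(X_ref.shape[1])]
    return float(np.mean(psis)), psis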
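
The article mentions KS alongside PSI as an input-drift signal but only implements PSI. As a complementary sketch, the two-sample KS statistic (the largest gap between two empirical CDFs) can be computed with NumPy alone. The function name `ks_statistic` and the toy samples are assumptions made for this example; it returns only the statistic, not a p-value.

```python
import numpy as np

def ks_statistic(ref, cur):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    ref = np.sort(np.asarray(ref, dtype=float))
    cur = np.sort(np.asarray(cur, dtype=float))
    grid = np.concatenate([ref, cur])
    # ECDF value at each grid point = share of samples <= that point
    cdf_ref = np.searchsorted(ref, grid, side="right") / ref.size
    cdf_cur = np.searchsorted(cur, grid, side="right") / cur.size
    return float(np.max(np.abs(cdf_ref - cdf_cur)))

rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, size=4000)      # reference sample
same = rng.normal(0.0, 1.0, size=4000)     # fresh sample, same distribution
shifted = rng.normal(1.0, 1.0, size=4000)  # mean moved by 1

print("KS, same distribution:", ks_statistic(ref, same))
print("KS, mean shifted by 1:", ks_statistic(ref, shifted))
```

Unlike PSI, this needs no binning choice, but like PSI it looks at one feature at a time and says nothing about p(y|x).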

4. Train the model on day 0

# Reproducibility
rng = np.random.default_rng(0)

# True rule (day 0)
w0 = np.array([1.0, -1.0])
b0 = 0.0
scale = 1.5

# Training data (day 0)
n_train_total = 6000
X0, y0 = generate_batch(n_train_total, mu=np.array([0.0, 0.0]), w=w0, b=b0, scale=scale, seed=1)

# train / reference split (the reference part is for the PSI baseline and the initial performance check)
X_train, X_ref, y_train, y_ref = train_test_split(X0, y0, test_size=0.3, random_state=0, stratify=y0)

# Model (standardization + logistic regression)
model = Pipeline([
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=2000))
])
model.fit(X_train, y_train)

# Initial performance (for reference)
p_ref = model.predict_proba(X_ref)[:, 1]
print("Initial AUC on ref:", roc_auc_score(y_ref, p_ref))
print("Initial ACC on ref:", accuracy_score(y_ref, (p_ref >= 0.5).astype(int)))
print("Initial logloss on ref:", log_loss(y_ref, p_ref))

5. Generate daily data and track PSI and AUC (the minimal experiment)

Data drift: the mean mu moves gradually (p(x) changes)
Concept drift: the boundary vector w rotates gradually (p(y|x) changes)

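T = 30        # number of days
n_day = 2000  # evaluation samples per day

delta_mu = 0.05       # data-drift strength (how far the mean moves per day)
theta_max = np.pi/3   # concept-drift strength (rotation angle on the final day = 60 degrees)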

records = []

for day in range(T):
    # For concept drift: rotate gradually
    theta = theta_max * day / (T - 1)
    w_day = rotate(w0, theta)

    # For data drift: move the mean gradually
    mu_day = np.array([delta_mu * day, 0.0])

    scenarios = {
        "no_drift":      (np.array([0.0, 0.0]), w0),
        "data_drift":    (mu_day, w0),
        "concept_drift": (np.array([0.0, 0.0]), w_day),
        "both":          (mu_day, w_day),
    }

    for idx, (name, (mu, w_true)) in enumerate(scenarios.items()):
        # Distinct seed per (day, scenario) so batches differ but stay reproducible
        X, y = generate_batch(n_day, mu=mu, w=w_true, b=b0, scale=scale, seed=1000 + 10*day + idx)
        p = model.predict_proba(X)[:, 1]

        auc = roc_auc_score(y, p)
        psi_mean, _ = psi_mean_over_features(X_train, X)

        records.append({
            "scenario": name,
            "day": day,
            "auc": auc,
            "psi": psi_mean,
        })

df = pd.DataFrame(records)
display(df.head())
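
To get a feel for how far the concept-drift schedule moves the rule, the sketch below prints the true direction on a few days together with its cosine similarity to the day-0 direction. It redefines `rotate` so that it runs on its own; the choice of days to print is arbitrary.

```python
import numpy as np

def rotate(v, theta):
    """Rotate the 2D vector v counterclockwise by theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ v

w0 = np.array([1.0, -1.0])
T, theta_max = 30, np.pi / 3

for day in (0, 10, 20, 29):
    w_day = rotate(w0, theta_max * day / (T - 1))
    cos_sim = w_day @ w0 / (np.linalg.norm(w_day) * np.linalg.norm(w0))
    print(day, np.round(w_day, 3), round(float(cos_sim), 3))
```

On the final day the cosine is 0.5 (a 60-degree rotation), so the day-0 direction still points partly along the true rule. Under that reading one would expect AUC to degrade rather than collapse all the way to chance.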

6. Plot the figures (three in total)

Figure 1: Input drift (PSI) over time

pivot_psi = df.pivot(index="day", columns="scenario", values="psi")

plt.figure(figsize=(7.2, 4.6))
for sc in pivot_psi.columns:
    plt.plot(pivot_psi.index, pivot_psi[sc], label=sc)
plt.xlabel("day")
plt.ylabel("PSI (mean over features)")
plt.title("Data drift signal (PSI) over time")
plt.legend()
plt.tight_layout()
plt.savefig("fig1_psi_over_time.png", dpi=200)
plt.show()

print("saved: fig1_psi_over_time.png")

Figure 2: Model performance (ROC-AUC) over time

pivot_auc = df.pivot(index="day", columns="scenario", values="auc")

plt.figure(figsize=(7.2, 4.6))
for sc in pivot_auc.columns:
    plt.plot(pivot_auc.index, pivot_auc[sc], label=sc)
plt.xlabel("day")
plt.ylabel("ROC-AUC")
plt.ylim(0.5, 1.0)
plt.title("Model performance over time")
plt.legend()
plt.tight_layout()
plt.savefig("fig2_auc_over_time.png", dpi=200)
plt.show()

print("saved: fig2_auc_over_time.png")

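Figure 3: Similar-looking inputs, different failure modes (final-day scatter plots with boundaries)

Key points:

data_drift: the point cloud moves (PSI rises), but the true boundary (solid) and the model boundary (dashed) nearly coincide → performance tends to hold
concept_drift: the point cloud is unchanged (PSI nearly flat), yet the true boundary rotates away from the model boundary → performance drops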

# Convert the coefficients so the model boundary can be drawn in the original x space (undo the standardization)
scaler = model.named_steps["scaler"]
lr = model.named_steps["logreg"]
w_scaled = lr.coef_.ravel()
b_scaled = lr.intercept_[0]
w_model = w_scaled / scaler.scale_
b_model = b_scaled - np.sum(w_scaled * scaler.mean_ / scaler.scale_)

# True rule on the final day
day = T - 1
theta = theta_max * day / (T - 1)
w_true_concept = rotate(w0, theta)
mu_day = np.array([delta_mu * day, 0.0])

# Data for plotting (final day)
plot_scenarios = [
    ("no_drift",      np.array([0.0, 0.0]), w0,              5000),
    ("data_drift",    mu_day,              w0,              5001),
    ("concept_drift", np.array([0.0, 0.0]), w_true_concept,  5002),
    ("both",          mu_day,              w_true_concept,  5003),
]

data_plot = {}
for name, mu, w_true, seed in plot_scenarios:
    Xp, yp = generate_batch(1200, mu=mu, w=w_true, b=b0, scale=scale, seed=seed)
    data_plot[name] = (Xp, yp, w_true)

# Line: w1*x1 + w2*x2 + b = 0 -> x2 = -(w1*x1 + b)/w2
xlim = (-6, 6)
ylim = (-6, 6)
x1_vals = np.linspace(xlim[0], xlim[1], 200)

fig, axes = plt.subplots(2, 2, figsize=(10, 10), sharex=True, sharey=True)
axes = axes.ravel()

for ax, (name, (Xp, yp, w_true)) in zip(axes, data_plot.items()):
    ax.scatter(Xp[:, 0], Xp[:, 1], c=yp, s=10, alpha=0.6, cmap="coolwarm", vmin=0, vmax=1)

    x2_model = -(w_model[0] * x1_vals + b_model) / w_model[1]
    x2_true  = -(w_true[0]  * x1_vals + b0) / w_true[1]

    ax.plot(x1_vals, x2_model, linestyle="--", linewidth=2, label="model boundary")
    ax.plot(x1_vals, x2_true,  linestyle="-",  linewidth=2, label="true boundary")

    ax.set_title(name)
    ax.set_xlim(xlim); ax.set_ylim(ylim)
    ax.set_xlabel("x1"); ax.set_ylabel("x2")
    ax.grid(alpha=0.2)

handles, labels = axes[0].get_legend_handles_labels()
fig.legend(handles, labels, loc="upper right")
fig.suptitle(
    f"Day {day}: data drift vs concept drift\npoints colored by y, -- model boundary (trained at day0), solid true boundary",
    y=0.98
)
fig.tight_layout(rect=[0, 0, 1, 0.95])
fig.savefig("fig3_scatter_boundaries.png", dpi=200)
plt.show()

print("saved: fig3_scatter_boundaries.png")
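
The Figure 3 code converts the coefficients learned on standardized inputs back to the original x space. The identity behind it is w_s · ((x - mean) / std) + b_s = (w_s / std) · x + (b_s - Σ w_s * mean / std). The sketch below checks this numerically; the data, `w_scaled` and `b_scaled` are made-up values for illustration, not the fitted model from this article.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[2.0, -1.0], scale=[3.0, 0.5], size=(5, 2))

# Hypothetical scaler statistics and coefficients, for illustration only
mean, std = X.mean(axis=0), X.std(axis=0)
w_scaled, b_scaled = np.array([0.8, -1.2]), 0.3

# Logit computed the way the pipeline does: standardize, then apply the linear model
logit_pipeline = ((X - mean) / std) @ w_scaled + b_scaled

# Same logit expressed directly in the original x space
w_model = w_scaled / std
b_model = b_scaled - np.sum(w_scaled * mean / std)
logit_original = X @ w_model + b_model

print(np.allclose(logit_pipeline, logit_original))  # True
```

Because the two logits agree for every point, the line w_model · x + b_model = 0 is the model's 0.5-probability boundary in the coordinates used for the scatter plots.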

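7. How to read the results

Pattern A: PSI rises but AUC does not drop (looks like data drift)

The inputs have indeed changed
But the rule itself has not, so performance may hold
→ Not "broken right now," but a state of elevated risk

What to do in practice:

Check data coverage (are inputs leaving the region seen in training?)
Monitor OOD detection and prediction intervals alongside
Set an operating rule such as "retrain if drift stays above a threshold for a sustained period"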
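
The last item, a persistence-based retraining rule, can be sketched in a few lines. The function name `should_retrain`, the threshold of 0.1, the patience of 5 days, and the PSI series are all assumptions made for this example.

```python
def should_retrain(psi_history, threshold=0.1, patience=5):
    """True if the most recent `patience` PSI values are all >= threshold."""
    if len(psi_history) < patience:
        return False
    return all(v >= threshold for v in psi_history[-patience:])

# Made-up daily PSI values: one isolated spike, then sustained drift
daily_psi = [0.01, 0.02, 0.12, 0.03, 0.11, 0.13, 0.15, 0.18, 0.20]

for day in range(1, len(daily_psi) + 1):
    if should_retrain(daily_psi[:day]):
        print("retrain triggered on day index", day - 1)  # prints 8
        break
```

Requiring several consecutive days above the threshold keeps a single noisy day (index 2 here) from triggering a retrain, at the cost of reacting a few days later.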

Pattern B: PSI nearly flat but AUC drops (looks like concept drift)

The input distribution looks the same
Yet performance drops → the rule (p(y|x)) may have changed