@_kuma(くま)Team Coffeein

株式会社クイック

A "codebase health check" with PHPStan × GitHub Pages: building a dashboard that tracks cognitive complexity over time and makes technical debt visible

Posted at 2026-06-17, last updated at 2026-06-17

I measured the whole codebase at regular intervals with PHPStan's cognitive complexity plugin and built a team-shared dashboard on GitHub Pages

Introduction

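In a previous article, I built a CI setup that reports cognitive complexity only for the methods changed in a PR.

Don't turn existing debt into noise: visualizing cognitive complexity only for the code you just touched, with PHPStan × GitHub Actions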

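Per-PR feedback is effective for finding out on the spot whether the code you just wrote is too complex. On its own, though, it does not show how the codebase as a whole is trending, which makes it hard for the team to build a shared understanding of questions such as:

Is the total number of violations going up or down?
Which functions, in which repository, are especially complex?
Are the results accumulated in a form that can be compared from one measurement to the next?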

This article describes how I measure the whole codebase periodically and publish the trend on GitHub Pages.

Overview

Run PHPStan
    ↓ report.json
parse_report.py (aggregation, severity classification)
    ↓ stats JSON
update_history.py (accumulates into history.json / history.js)
    +
generate_html.py (generates the detail HTML)
    ↓
docs/cognitive/
  ├── index.html         # Dashboard (trend charts + summary)
  ├── app.html           # Main app: detail table
  ├── app-sub.html       # Sub app: detail table
  └── data/
      ├── history.json   # Time-series data (machine-readable)
      └── history.js     # Time-series data (for the browser)

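The system consists of three Python scripts and static HTML. It depends on no external services and runs on GitHub Pages alone (published from the docs/ folder).

Installing the plugin

Cognitive complexity is measured with tomasvotruba/cognitive-complexity. It runs as a PHPStan extension and can be added to an existing project with a single extra line in the configuration file. Thanks to its author, Tomas Votruba.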

composer require --dev tomasvotruba/cognitive-complexity

PHPStan configuration

I prepare a dedicated configuration file, phpstan-metrics.neon, to keep this measurement separate from the regular type-check run.

includes:
    - phpstan.neon
    - vendor/tomasvotruba/cognitive-complexity/config/extension.neon

parameters:
    cognitive_complexity:
        class: 60
        function: 10

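The thresholds are set to class: 60 and function: 10. Using the same values as the per-PR CI check (phpstan-ci-cognitive.neon) keeps the whole-codebase measurement and CI on the same standard.

Running the measurement

PHPStan runs in a Docker environment and writes its results as JSON.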

docker exec composer bash -c "
  cd /app
  vendor/bin/phpstan analyse \
    -c phpstan-metrics.neon \
    --error-format=json \
    --no-progress \
    --memory-limit=1G \
    app/ > storage/cognitive/report.json
" || true

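The trailing || true is there because PHPStan exits with code 1 when it finds violations; without it, a run that reports violations would be treated as a failed command.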
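One side effect of || true is that a run that crashed (for example, by exhausting memory) also looks successful, and report.json may then be empty or truncated. The following is a minimal sketch of a sanity check that could run before parsing. load_report is a hypothetical helper, not part of the article's scripts, and it assumes only the top-level "files" section that parse_report.py already reads.

```python
import json


def load_report(path):
    """Return the parsed PHPStan report, or raise ValueError if it is unusable.

    Hypothetical helper: it relies only on the top-level "files" section
    that parse_report.py already reads.
    """
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError) as exc:
        raise ValueError(f"{path}: not a readable JSON report ({exc})") from exc
    # "files" is an object keyed by path; accept a list too in case it is empty.
    if not isinstance(data, dict) or not isinstance(data.get("files"), (dict, list)):
        raise ValueError(f"{path}: no 'files' section; did PHPStan crash?")
    return data
```

Failing loudly here keeps a broken run from being recorded in the history as "zero violations".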

Parsing report.json (parse_report.py)

PHPStan's JSON output also contains other errors, such as type-check violations, so a Python script extracts only the cognitive complexity messages.

import json, re, sys

def severity(score, limit):
    ratio = score / limit
    if ratio >= 3.0: return "critical"
    if ratio >= 2.0: return "high"
    return "medium"

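def parse(report_path):
    # Use a context manager so the report file is closed deterministically
    with open(report_path, encoding="utf-8") as f:
        data = json.load(f)
    # "files" can be an empty list when nothing is reported, so fall back to {}
    files = data.get("files") or {}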

    total_violations = 0
    affected_files = set()
    function_critical = function_high = function_medium = 0
    class_critical = class_high = class_medium = 0
    function_violations = class_violations = 0

    for path, info in files.items():
        has_cog = False
        for msg in info.get("messages", []):
            text = msg.get("message", "")
            m_func = re.match(
                r'Cognitive complexity for ".+" is (\d+), keep it under (\d+)', text
            )
            m_class = re.match(
                r'Class cognitive complexity is (\d+), keep it under (\d+)', text
            )
            if m_func:
                score, limit = int(m_func.group(1)), int(m_func.group(2))
                sev = severity(score, limit)
                total_violations += 1
                function_violations += 1
                has_cog = True
                if sev == "critical":   function_critical += 1
                elif sev == "high":     function_high += 1
                else:                   function_medium += 1
            elif m_class:
                score, limit = int(m_class.group(1)), int(m_class.group(2))
                sev = severity(score, limit)
                total_violations += 1
                class_violations += 1
                has_cog = True
                if sev == "critical":   class_critical += 1
                elif sev == "high":     class_high += 1
                else:                   class_medium += 1
        if has_cog:
            affected_files.add(path)

    print(json.dumps({
        "total_violations": total_violations,
        "affected_files": len(affected_files),
        "function_critical": function_critical,
        "function_high": function_high,
        "function_medium": function_medium,
        "class_critical": class_critical,
        "class_high": class_high,
        "class_medium": class_medium,
        "function_violations": function_violations,
        "class_violations": class_violations,
    }))

if __name__ == "__main__":
    parse(sys.argv[1])

Running it prints JSON like the following to stdout.

{
  "total_violations": 163,
  "affected_files": 89,
  "function_critical": 14,
  "function_high": 21,
  "function_medium": 98,
  "class_critical": 3,
  "class_high": 5,
  "class_medium": 19,
  "function_violations": 133,
  "class_violations": 27
}

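Classify severity by the ratio score ÷ threshold

Classifying by fixed scores would shift the standard between repositories that use different thresholds. Using ratio = score / limit gives a consistent severity even when comparing across multiple repositories.

ratio	severity
≥ 3.0	Critical
≥ 2.0	High
< 2.0	Medium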
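As a worked example of the table above, the sketch below applies the same ratio rule to a few illustrative (score, limit) pairs. The pairs are made up for illustration; the limits 10 and 60 are the function and class thresholds used in this article.

```python
def severity(score, limit):
    """Classify a violation by how far the score exceeds its threshold."""
    ratio = score / limit
    if ratio >= 3.0:
        return "critical"
    if ratio >= 2.0:
        return "high"
    return "medium"


# Illustrative pairs: functions use limit 10, classes use limit 60.
# Equal ratios give equal severities regardless of the absolute score.
examples = [(30, 10), (180, 60), (25, 10), (150, 60), (12, 10)]
for score, limit in examples:
    print(f"{score}/{limit} = {score / limit:.1f}x -> {severity(score, limit)}")
```

A function at 30 against a limit of 10 and a class at 180 against a limit of 60 both come out as critical, which is the point of classifying by ratio instead of by raw score.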

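Read the threshold from the message at run time

There is no need to hard-code the configured values: PHPStan's message contains "keep it under N", so limit can be read from there. Changing the configuration then requires no change on the Python side.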

m_func = re.match(
    r'Cognitive complexity for ".+" is (\d+), keep it under (\d+)', text
)
score, limit = int(m_func.group(1)), int(m_func.group(2))  # follows the NEON values automatically
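To make the extraction concrete, here is a small self-contained sketch that runs both patterns against sample messages. The extract function and the sample messages are hypothetical; as a variation on the original pattern, the function name is also captured.

```python
import re

# Same message shapes as in parse_report.py; the function pattern additionally
# captures the quoted name (a variation for illustration).
FUNC_RE = re.compile(r'Cognitive complexity for "(.+)" is (\d+), keep it under (\d+)')
CLASS_RE = re.compile(r'Class cognitive complexity is (\d+), keep it under (\d+)')


def extract(text):
    """Return (kind, name, score, limit), or None for unrelated messages."""
    m = FUNC_RE.match(text)
    if m:
        return ("function", m.group(1), int(m.group(2)), int(m.group(3)))
    m = CLASS_RE.match(text)
    if m:
        return ("class", None, int(m.group(1)), int(m.group(2)))
    return None


# Hypothetical sample message, shaped like the plugin's output.
sample = 'Cognitive complexity for "OrderService::calculate()" is 25, keep it under 10'
print(extract(sample))
```

Messages that match neither pattern, such as ordinary type-check errors, return None and can be skipped.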

Accumulating history (update_history.py)

update_history.py receives the stats JSON on stdin and updates history.json and history.js.

import json, sys, datetime, os

HISTORY_PATH = "docs/cognitive/data/history.json"
HISTORY_JS_PATH = "docs/cognitive/data/history.js"

def main():
    args = sys.argv[1:]
    repo = args[0]
    insight = None
    if "--insight" in args:
        idx = args.index("--insight")
        insight = args[idx + 1]

    stats = json.loads(sys.stdin.read())
    today = datetime.date.today().isoformat()

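    if os.path.exists(HISTORY_PATH):
        # Read as UTF-8 so non-ASCII insight text round-trips on any platform
        with open(HISTORY_PATH, encoding="utf-8") as f:
            data = json.load(f)
    else:
        data = {"schema_version": 1, "entries": []}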

    entries = data.get("entries", [])
    new_entry = {"date": today, "repo": repo, **stats}
    if insight:
        new_entry["insight"] = insight

    # Keyed on date + repo: a re-measurement on the same day overwrites the entry
    replaced = False
    for i, e in enumerate(entries):
        if e.get("date") == today and e.get("repo") == repo:
            entries[i] = new_entry
            replaced = True
            break
    if not replaced:
        entries.append(new_entry)

    entries.sort(key=lambda e: (e["date"], e["repo"]))
    data["entries"] = entries

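    # Create the data directory on first run and write UTF-8 explicitly,
    # because ensure_ascii=False leaves non-ASCII insight text unescaped
    os.makedirs(os.path.dirname(HISTORY_PATH), exist_ok=True)
    with open(HISTORY_PATH, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

    # Write history.js at the same time (JSON-as-JS pattern)
    js_content = "window.historyData = " + json.dumps(data, indent=2, ensure_ascii=False) + ";\n"
    with open(HISTORY_JS_PATH, "w", encoding="utf-8") as f:
        f.write(js_content)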

if __name__ == "__main__":
    main()

To use it, just pipe the two scripts together.

python3 parse_report.py report.json \
  | python3 update_history.py app \
  --insight "First measurement. The 14 Function Critical violations are the top priority."
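The overwrite-on-same-day behaviour can be isolated as a pure function, which makes it easy to test without touching any files. This is a sketch under that assumption; upsert_entry is a hypothetical name, and the script above does the same thing in place with a loop.

```python
def upsert_entry(entries, new_entry):
    """Return a new sorted list in which new_entry replaces any entry
    that has the same (date, repo) key."""
    key = (new_entry["date"], new_entry["repo"])
    kept = [e for e in entries if (e.get("date"), e.get("repo")) != key]
    kept.append(new_entry)
    return sorted(kept, key=lambda e: (e["date"], e["repo"]))


# Illustrative data: re-measuring "app" on the same day replaces its entry.
history = [
    {"date": "2026-06-10", "repo": "app", "total_violations": 170},
    {"date": "2026-06-17", "repo": "app", "total_violations": 165},
]
history = upsert_entry(history, {"date": "2026-06-17", "repo": "app", "total_violations": 163})
history = upsert_entry(history, {"date": "2026-06-17", "repo": "app-sub", "total_violations": 40})
print([(e["date"], e["repo"], e["total_violations"]) for e in history])
```

Because ISO 8601 dates sort correctly as strings, sorting on (date, repo) keeps the history in chronological order.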

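Avoiding CORS with the JSON-as-JS pattern

Loading the data from static HTML with fetch('data/history.json') fails with a CORS error under the file:// protocol. This is not a problem on GitHub Pages, but it is convenient to be able to check the page locally just by running open index.html.

So history.js is written out at the same time, using a pattern that assigns the data to a global variable.

// history.js (generated file)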
window.historyData = {
  "schema_version": 1,
  "entries": [...]
};

<!-- index.html -->
<script src="data/history.js"></script>
<script>
  const entries = window.historyData.entries;
</script>

This works under both file:// and https://.
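Since the same data is now stored in two files, a round-trip check can confirm that history.js still matches history.json. The sketch below is one possible way to do that; to_history_js and from_history_js are hypothetical helpers that assume the exact "window.historyData = ...;" framing shown above.

```python
import json

PREFIX = "window.historyData = "


def to_history_js(data):
    """Serialize data in the JSON-as-JS form used for history.js."""
    return PREFIX + json.dumps(data, indent=2, ensure_ascii=False) + ";\n"


def from_history_js(text):
    """Parse a history.js payload back into a Python object."""
    body = text.strip()
    if not body.startswith(PREFIX) or not body.endswith(";"):
        raise ValueError("not a history.js payload")
    # Drop the assignment prefix and the trailing semicolon, leaving plain JSON.
    return json.loads(body[len(PREFIX):-1])


data = {"schema_version": 1, "entries": [{"date": "2026-06-17", "repo": "app"}]}
assert from_history_js(to_history_js(data)) == data
```

A check like this could compare the parsed history.js against history.json before the files are committed.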

Generating the detail HTML (generate_html.py)

generate_html.py generates, from report.json, an HTML page containing a table with one row per function or class.

Converting severity to a CSS class and a label

def severity_class(score, limit):
    ratio = score / limit
    if ratio >= 3: return "sev-critical"
    if ratio >= 2: return "sev-high"
    return "sev-medium"

def severity_label(score, limit):
    ratio = score / limit
    if ratio >= 3: return "Critical"
    if ratio >= 2: return "High"
    return "Medium"

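Generating table rows

Each score is normalized against the largest score in the table to draw an inline bar.

from html import escape  # escapes &, <, > and quotes in paths and names

def make_rows_html(rows_list):
    if not rows_list:
        return ""  # max() raises ValueError on an empty sequence
    max_score = max(r["score"] for r in rows_list)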
    out = []
    for r in rows_list:
        sc = severity_class(r["score"], r["limit"])
        sl = severity_label(r["score"], r["limit"])
        ratio = round(r["score"] / r["limit"], 1)
        bar_width = min(100, int(r["score"] / max_score * 100))
        out.append(
            f'<tr class="{sc}" data-score="{r["score"]}" data-type="{r["type"]}">'
            f'<td>{escape(r["path"])}:{r["line"]}</td>'
            f'<td>{escape(r["name"])}</td>'
            f'<td><span class="score-num">{r["score"]}</span>'
            f'<div class="bar-bg"><div class="bar-fill {sc}" style="width:{bar_width}%"></div></div></td>'
            f'<td>{r["limit"]}</td>'
            f'<td>{ratio}&times;</td>'
            f'<td><span class="badge {sc}">{sl}</span></td>'
            '</tr>'
        )
    return "\n".join(out)
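The bar width above is computed with float division followed by int() truncation, which can come out one percent low for some inputs because of floating-point rounding. Assuming the scores are integers, a sketch of the same normalization in integer arithmetic looks like this; bar_width is a hypothetical helper, not part of generate_html.py.

```python
def bar_width(score, max_score):
    """Percentage width (0-100) of the inline bar, using integer arithmetic."""
    if max_score <= 0:
        return 0  # nothing to normalize against
    return min(100, score * 100 // max_score)


# Illustrative scores: the largest one fills the bar completely.
scores = [12, 30, 60]
print([bar_width(s, max(scores)) for s in scores])
```

The guard for max_score <= 0 also avoids a division by zero if every score in the table is zero.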

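Sorting and filtering are done entirely on the client

These are implemented in vanilla JS with no external dependencies. Sorting is split by column type, with parseFloat for numeric columns (ratio) and localeCompare for text columns, so file paths containing Japanese characters also sort correctly.

Dashboard (index.html + Chart.js)

index.html draws the charts and summary cards from window.historyData.

Three charts

Total Violations (line): trends for both repositories on one chart
Function violations (stacked bar): Critical / High / Medium stacked per repository
Class violations (stacked bar): same as above

Stacked bar chart (makeStackedChart)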

function makeStackedChart(canvasId, labels,
    dsKtCrit, dsKtHigh, dsKtMed,
    dsEntCrit, dsEntHigh, dsEntMed) {
  new Chart(document.getElementById(canvasId), {
    type: 'bar',
    data: {
      labels,
      datasets: [
        { label: 'app Critical',     data: dsKtCrit,  backgroundColor: '#1e3a8a', stack: 'app' },
        { label: 'app High',         data: dsKtHigh,  backgroundColor: '#4f63d2', stack: 'app' },
        { label: 'app Medium',       data: dsKtMed,   backgroundColor: '#a5b4fc', stack: 'app' },
        { label: 'app-sub Critical', data: dsEntCrit, backgroundColor: '#064e3b', stack: 'app-sub' },
        { label: 'app-sub High',     data: dsEntHigh, backgroundColor: '#059669', stack: 'app-sub' },
        { label: 'app-sub Medium',   data: dsEntMed,  backgroundColor: '#6ee7b7', stack: 'app-sub' },
      ]
    },
    options: {
      responsive: true, maintainAspectRatio: false,
      plugins: { legend: { position: 'top', labels: { font: { size: 11 } } } },
      scales: {
        x: { stacked: true, ticks: { font: { size: 10 } } },
        y: { stacked: true, beginAtZero: true, ticks: { font: { size: 10 } } }
      }
    }
  });
}

// Call site (dataByDate and dates are defined elsewhere and not shown here)
makeStackedChart('chart-func', dates,
  dataByDate(entries, 'app',     'function_critical', dates),
  dataByDate(entries, 'app',     'function_high',     dates),
  dataByDate(entries, 'app',     'function_medium',   dates),
  dataByDate(entries, 'app-sub', 'function_critical', dates),
  dataByDate(entries, 'app-sub', 'function_high',     dates),
  dataByDate(entries, 'app-sub', 'function_medium',   dates),
);

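Datasets that share the same stack string are stacked on top of each other, so the two repositories appear as side-by-side bars for each date.

AI comment feature

Each measurement entry can carry an insight field, which holds a comment on the change since the previous measurement.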

{
  "date": "2026-06-17",
  "repo": "app",
  "total_violations": 163,
  "function_critical": 14,
  "function_medium": 98,
  "insight": "Function Critical increased from 2 to 14. Raising the priority of refactoring is recommended."
}

Because the comment is simply passed with the --insight option, it can be supplied by hand or from a script.

python3 parse_report.py report.json \
  | python3 update_history.py app \
  --insight "+5 vs. previous. Function Critical unchanged."

The comment is displayed on the dashboard's summary card.
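The article does not specify how the comment text is produced. As one deterministic option for the "from a script" case, the sketch below builds a short comparison string from the previous and current stats. build_insight and the choice of keys are assumptions for illustration, not the author's method.

```python
def build_insight(previous, current):
    """Describe the change between two stats dicts for the same repo.

    previous is None on the first measurement.
    """
    if previous is None:
        return f"First measurement. total_violations={current['total_violations']}."
    parts = []
    for key in ("total_violations", "function_critical", "class_critical"):
        delta = current[key] - previous.get(key, 0)
        parts.append(f"{key} {delta:+d}" if delta else f"{key} unchanged")
    return "vs. previous: " + ", ".join(parts) + "."


# Illustrative stats for two consecutive measurements.
prev = {"total_violations": 158, "function_critical": 14, "class_critical": 3}
curr = {"total_violations": 163, "function_critical": 14, "class_critical": 4}
print(build_insight(prev, curr))
```

The resulting string could be passed to update_history.py through the --insight option shown above.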

Running everything end to end

This is an example of measuring both repositories in one go.

# 1. Measure (run app and app-sub in turn)
for REPO in app app-sub; do
  docker exec composer bash -c "
    cd /app/${REPO}
    vendor/bin/phpstan analyse \
      -c phpstan-metrics.neon \
      --error-format=json \
      --no-progress \
      --memory-limit=1G \
      app/ > storage/cognitive/report.json
  " || true
  cp repos/${REPO}/storage/cognitive/report.json \
     sandbox/_work/cognitive/${REPO}/report.json
done

# 2. Aggregate, then update the history
python3 scripts/parse_report.py sandbox/_work/cognitive/app/report.json \
  | python3 scripts/update_history.py app \
  --insight "vs. previous ..."

python3 scripts/parse_report.py sandbox/_work/cognitive/app-sub/report.json \
  | python3 scripts/update_history.py app-sub \
  --insight "vs. previous ..."

# 3. Regenerate the detail HTML
python3 scripts/generate_html.py app     sandbox/_work/cognitive/app/report.json     docs/cognitive/app.html
python3 scripts/generate_html.py app-sub sandbox/_work/cognitive/app-sub/report.json docs/cognitive/app-sub.html

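Summary

Aspect	Per PR (previous article)	Whole-codebase periodic measurement (this article)
Purpose	Quality check on the code just written	Understanding the health of the whole codebase
Timing	Automatic on each PR	Manual (at any time)
Output	PR comment	HTML dashboard
Accumulation	None	Time series accumulated in history.json

The two approaches serve different purposes. If the PR check is a brake that keeps the code in front of you from getting worse, whole-codebase measurement is a compass for understanding the health of the project as a whole. Combining them lets the whole team take on refactoring with a shared understanding of why it matters.

Next, I plan to automate the periodic measurement. Once manual updates are no longer needed, a culture in which the whole team keeps an eye on code health should take root more easily. I encourage you to try this health check dashboard in your own projects.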
