Trying Out the Tech Radar #3: Visualizing Structured Output with a Tool That Converts JSON Schema into an LLM Prompt

Posted at 2026-05-31

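Thoughtworks Technology Radar Vol 34 (April 2026) lists Structured output from LLMs in the Adopt ring. The pattern is "hand the LLM a JSON Schema and get back JSON that conforms to it", and Adopt means the Radar already considers it something the industry should be adopting. To make visible what libraries such as Instructor, Pydantic AI, and Outlines do internally, I wrote a tool in 500 lines of vanilla JS. Paste a JSON Schema and it produces (a) a natural-language prompt for the LLM and (b) an example of the expected output, and it can also (c) validate a received LLM output against the schema.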

🌐 Demo: https://sen.ltd/portfolio/schema-prompt/
📦 GitHub: https://github.com/sen-ltd/schema-prompt

The two-stage structure of structured output

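"Passing a schema to the LLM" sounds like one step, but in an implementation it splits into two separate responsibilities:

Schema → prompt conversion: an LLM does not read JSON Schema natively, so type, properties, required, and so on have to be restated as natural-language instructions. OpenAI's response_format: { type: "json_schema" } accepts a schema directly (the older { type: "json_object" } mode only guarantees syntactically valid JSON and takes no schema), but even there the provider still has to turn the schema into something that steers generation.
Output validation: even when the LLM believes it followed the schema, out-of-range enum values, minimum violations, and missing required fields happen routinely. Validating at the boundary converts these into typed errors that the caller can handle.

Libraries such as Instructor hide both halves, but if you call the raw API you write them yourself. This tool exposes both in a visible form.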

What goes into Schema → prompt

Sample: sentiment + confidence schema

{
  "type": "object",
  "properties": {
    "sentiment": {
      "type": "string",
      "enum": ["positive", "negative", "neutral"],
      "description": "Overall sentiment of the input text"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score, 0–1"
    },
    "keywords": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 },
      "maxItems": 5
    }
  },
  "required": ["sentiment", "confidence"]
}

Converting this gives:

Return a JSON object that conforms to the following structure.
Output JSON only — no prose, no code fences.

Fields:
- sentiment: string [one of: ["positive","negative","neutral"]] (required) — Overall sentiment of the input text
- confidence: number [≥ 0, ≤ 1] (required) — Confidence score, 0–1
- keywords: array [max 5 items] (optional)
  - (each item): string [length ≥ 1] (optional)

Use null for fields you cannot determine. Do not invent data.

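Key points:

enum is expanded as [one of: ...]: the model is not assumed to understand JSON Schema syntax directly, so the keyword is rewritten into a form a human would read
minimum/maximum become [≥ 0, ≤ 1]: LLMs read inequality symbols without trouble
(required) / (optional) is stated explicitly. LLMs tend to overlook required, so it is spelled out on every field
(each item) expresses the nested structure of arrays, with indentation alone indicating the hierarchy
The trailing "Use null for fields you cannot determine. Do not invent data." is a stock phrase for suppressing hallucination. Without it, the LLM tends to make up plausible-looking values for fields it cannot fill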
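
The bracketed constraint tags above come from a formatter the article calls formatConstraints but does not show. The following is a minimal sketch of what such a function might look like, assuming the tag wording seen in the sample output; the actual implementation in the repository may differ.

```javascript
// Hypothetical sketch of a constraint formatter. The tool's real
// formatConstraints is not shown in the article and may differ.
function formatConstraints(schema) {
  const parts = [];
  if (Array.isArray(schema.enum)) parts.push(`one of: ${JSON.stringify(schema.enum)}`);
  if (typeof schema.minimum === "number") parts.push(`≥ ${schema.minimum}`);
  if (typeof schema.maximum === "number") parts.push(`≤ ${schema.maximum}`);
  if (typeof schema.minLength === "number") parts.push(`length ≥ ${schema.minLength}`);
  if (typeof schema.maxLength === "number") parts.push(`length ≤ ${schema.maxLength}`);
  if (typeof schema.minItems === "number") parts.push(`min ${schema.minItems} items`);
  if (typeof schema.maxItems === "number") parts.push(`max ${schema.maxItems} items`);
  // Leading space so the tag can be appended directly after the type name.
  return parts.length ? ` [${parts.join(", ")}]` : "";
}

console.log(`number${formatConstraints({ type: "number", minimum: 0, maximum: 1 })}`);
// prints: number [≥ 0, ≤ 1]
console.log(`array${formatConstraints({ type: "array", maxItems: 5 })}`);
// prints: array [max 5 items]
```

Returning an empty string when no constraint is present keeps unconstrained fields free of an empty "[]" tag.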

The core of the implementation: recursion in `describeProp`

A function that describes each property on one line is applied recursively:

function describeProp(propName, schema, indent = 0) {
  const pad = "  ".repeat(indent);
  const type = jsonType(schema);
  const required = schema._required === true;
  const tag = required ? " (required)" : " (optional)";
  const desc = schema.description ? ` — ${schema.description}` : "";

  if (type === "object" && schema.properties) {
    const inner = describeObject(schema, indent + 1);
    return `${pad}- ${propName}: object${tag}${desc}\n${inner}`;
  }
  if (type === "array" && schema.items) {
    const arrayConstraint = formatConstraints(schema);
    const itemDesc = describeProp("(each item)", schema.items, indent + 1);
    return `${pad}- ${propName}: array${arrayConstraint}${tag}${desc}\n${itemDesc}`;
  }
  const constraint = formatConstraints(schema);
  return `${pad}- ${propName}: ${type}${constraint}${tag}${desc}`;
}

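object and array recurse; everything else is emitted on a single line that combines the type and its constraints. required is propagated from parent to child through a _required attribute (in JSON Schema, required lives on the object rather than on each property, so it has to be carried along during recursion).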
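
One way to carry required down to the children is to stamp each child schema before describing it. The sketch below assumes a helper named withRequiredFlags, which is hypothetical and not taken from the repository; it copies each child rather than mutating the input schema.

```javascript
// Hypothetical helper: copy each child schema and stamp it with a
// _required flag derived from the parent's "required" array.
function withRequiredFlags(objectSchema) {
  const required = new Set(objectSchema.required ?? []);
  return Object.entries(objectSchema.properties ?? {}).map(([name, sub]) => [
    name,
    { ...sub, _required: required.has(name) },
  ]);
}

const schema = {
  type: "object",
  properties: {
    sentiment: { type: "string" },
    keywords: { type: "array", items: { type: "string" } },
  },
  required: ["sentiment"],
};

for (const [name, sub] of withRequiredFlags(schema)) {
  console.log(name, sub._required ? "(required)" : "(optional)");
}
// prints: sentiment (required)
// prints: keywords (optional)
```

Because array items never appear in a parent's required list, a scheme like this leaves them without the flag, which is consistent with the "(each item) ... (optional)" line in the sample prompt.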

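A bug I hit during implementation: I forgot to include the array's own constraints (minItems, maxItems). The first version did not call formatConstraints in the array branch and only described the contents of items, so [max 5 items] disappeared. A test caught it early: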

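import { test } from "node:test";
import assert from "node:assert/strict";
// Assumption: buildPrompt is exported from prompt.js; adjust the path to your layout.
import { buildPrompt } from "./prompt.js";

test("array bounds", () => {
  const out = buildPrompt({
    type: "object",
    properties: {
      tags: { type: "array", items: { type: "string" }, minItems: 1, maxItems: 5 },
    },
  });
  // Both array-level bounds must survive into the generated prompt.
  assert.match(out, /min 1 items/);
  assert.match(out, /max 5 items/);
});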

This is a typical job for a unit test: every branch of the recursion has to perform the same duty (formatting constraints), and the test checks that uniformity.

Synthesizing the example output

Besides the prompt, the tool also generates a sample of the expected JSON shape, which can serve as a few-shot anchor for the LLM:

export function buildExample(schema) {
  return JSON.stringify(synthesize(schema), null, 2);
}

function synthesize(schema) {
  if (schema.const !== undefined) return schema.const;
  if (schema.enum) return schema.enum[0];
  const type = jsonType(schema);
  if (type === "object" && schema.properties) {
    const out = {};
    for (const [name, sub] of Object.entries(schema.properties)) {
      out[name] = synthesize(sub);
    }
    return out;
  }
  if (type === "array" && schema.items) {
    return [synthesize(schema.items)];
  }
  if (type === "string") return schema.format ? `<${schema.format}>` : "<string>";
  if (type === "number" || type === "integer") return 0;
  if (type === "boolean") return false;
  return null;
}

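format: "email" yields "<email>", enum: ["a", "b"] yields the first value "a", const yields its fixed value, and everything else gets a per-type placeholder. The sample exists to convey shape, not real values, so leaving placeholders such as <email> in place is the right call.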
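
As an illustration of the few-shot anchor idea, the generated prompt and the synthesized example can be combined into one request. The helper name buildMessages and the { role, content } message shape are assumptions modeled on common chat-completion APIs; neither comes from the tool itself.

```javascript
// Hypothetical helper: combine the generated instructions and the
// synthesized example into chat messages. The { role, content } shape
// is an assumption, not something the tool defines.
function buildMessages(promptText, exampleJson, userText) {
  const system = [
    promptText,
    "",
    "Example of the expected shape (placeholders, not real values):",
    exampleJson,
  ].join("\n");
  return [
    { role: "system", content: system },
    { role: "user", content: userText },
  ];
}

const messages = buildMessages(
  "Return a JSON object that conforms to the following structure.",
  JSON.stringify({ sentiment: "positive", confidence: 0 }, null, 2),
  "The battery life is great but the screen scratches easily."
);
console.log(messages[0].content);
```

Labeling the example as placeholders is a small guard against the model copying the sample values into its answer.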

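Output validator: the line of defense at the LLM output boundary

Even when shown the schema, LLMs casually violate it:

Ignoring the enum and inventing a new value (returning "happy" even though the schema allows only ["positive", "negative", "neutral"])
Forgetting required fields
Returning -0.3 for a number with minimum: 0 (a negative confidence, a value that cannot exist)
Putting a string into a numeric field (confidence: "high")

Unless all of this is caught at the boundary, the downstream type system starts running on data that has already broken its guarantees. The validator emits errors with a path attached: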

const errs = validate(schema, { sentiment: "happy", confidence: 1.5 });
// → [
//   { path: "$.sentiment", message: 'must be one of ["positive","negative","neutral"]' },
//   { path: "$.confidence", message: "1.5 > maximum 1" },
// ]

Paths use a JSONPath-like notation such as $.sentiment, and nesting is supported ($.user.profile.email, for example). The errors come in a shape that can be fed into an LLM retry / error-feedback loop.
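
A retry loop built on those path-tagged errors might look like the sketch below. The names generateWithRetry, checkOutput, and formatFeedback are hypothetical, callModel stands in for whatever API client is used, and validate is assumed to return an array of { path, message } objects as in the example above.

```javascript
// Hypothetical sketch: callModel and validate are injected by the caller.
// validate(schema, value) is assumed to return [{ path, message }, ...].
function formatFeedback(errors) {
  const lines = errors.map((e) => `- ${e.path}: ${e.message}`);
  return `Your previous output was invalid:\n${lines.join("\n")}\nReturn corrected JSON only.`;
}

function checkOutput(raw, schema, validate) {
  let value;
  try {
    value = JSON.parse(raw);
  } catch (err) {
    // Unparseable output is reported as an error at the document root.
    return { value: undefined, errors: [{ path: "$", message: `invalid JSON: ${err.message}` }] };
  }
  return { value, errors: validate(schema, value) };
}

async function generateWithRetry({ prompt, schema, callModel, validate, maxAttempts = 3 }) {
  let currentPrompt = prompt;
  let errors = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(currentPrompt);
    const result = checkOutput(raw, schema, validate);
    if (result.errors.length === 0) {
      return { ok: true, value: result.value, attempts: attempt };
    }
    errors = result.errors;
    // Feed the path-tagged errors back so the model can correct specific fields.
    currentPrompt = `${prompt}\n\n${formatFeedback(errors)}`;
  }
  return { ok: false, errors, attempts: maxAttempts };
}
```

Capping the attempts and returning the last error list, rather than throwing, leaves the decision about fallback behavior to the caller.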

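A subset implementation of JSON Schema

Implementing all of Draft 7 / 2020-12 would take thousands of lines, so the tool is limited to the keywords that LLM workflows actually use:

Implemented: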

type (string, number, integer, boolean, array, object, null + union)
properties + required
items + minItems / maxItems
enum / const
minimum / maximum
minLength / maxLength
pattern (regex)
format (hint only: not validated, shown to the LLM as an instruction)

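Deliberately omitted:

oneOf / anyOf / allOf: these rarely appear in schemas written for LLM output
$ref: in-schema references need a separate expansion step, and the payoff in LLM use is small for the complexity
additionalProperties: LLMs rarely return fields that were not requested, so enforcing the check is overkill

Implementation size: the validator is about 110 lines, which is enough for LLM use.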
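
To give a sense of how little code such a subset needs, here is a minimal illustrative validator. It is not the tool's validate.js: it covers only type, enum, minimum/maximum, required, properties, and items, and the error wording is an assumption chosen to match the example output shown earlier.

```javascript
// Minimal illustrative validator; not the tool's validate.js.
// Covers type, enum, minimum/maximum, required, properties, and items only.
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (Number.isInteger(value)) return "integer";
  return typeof value; // "string", "number", "boolean", "object", ...
}

function matchesType(value, type) {
  const actual = typeOf(value);
  // Every integer is also a valid "number".
  return actual === type || (type === "number" && actual === "integer");
}

function validate(schema, value, path = "$") {
  const errors = [];
  if (schema.type !== undefined) {
    const allowed = Array.isArray(schema.type) ? schema.type : [schema.type];
    if (!allowed.some((t) => matchesType(value, t))) {
      errors.push({ path, message: `expected ${allowed.join(" | ")}, got ${typeOf(value)}` });
      return errors; // the remaining checks assume the type is right
    }
  }
  // Strict equality is enough for enums of primitives.
  if (schema.enum && !schema.enum.some((e) => e === value)) {
    errors.push({ path, message: `must be one of ${JSON.stringify(schema.enum)}` });
  }
  if (typeof value === "number") {
    if (schema.minimum !== undefined && value < schema.minimum) {
      errors.push({ path, message: `${value} < minimum ${schema.minimum}` });
    }
    if (schema.maximum !== undefined && value > schema.maximum) {
      errors.push({ path, message: `${value} > maximum ${schema.maximum}` });
    }
  }
  if (typeOf(value) === "object") {
    for (const name of schema.required ?? []) {
      if (!(name in value)) {
        errors.push({ path: `${path}.${name}`, message: "required field is missing" });
      }
    }
    for (const [name, sub] of Object.entries(schema.properties ?? {})) {
      if (name in value) errors.push(...validate(sub, value[name], `${path}.${name}`));
    }
  }
  if (Array.isArray(value) && schema.items) {
    value.forEach((item, i) => errors.push(...validate(schema.items, item, `${path}[${i}]`)));
  }
  return errors;
}

const sentimentSchema = {
  type: "object",
  properties: {
    sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
    confidence: { type: "number", minimum: 0, maximum: 1 },
  },
  required: ["sentiment", "confidence"],
};
console.log(validate(sentimentSchema, { sentiment: "happy", confidence: 1.5 }));
```

Stopping at the first type mismatch for a given path avoids piling range or enum errors on top of a value that is already the wrong kind.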

Architecture

prompt.js    ← Schema → LLM prompt + example synthesizer (15 tests)
validate.js  ← JSON Schema validator subset (18 tests)
presets.js   ← 5 real-world schemas + matching samples
app.js       ← UI glue (schema → prompt + example, output → validation)

Neither prompt.js nor validate.js depends on the DOM. All 33 tests pass under Node before the UI is assembled.

The 5 presets:

sentiment + confidence: a classic NLP task
address parser: structured extraction from an address string
meeting summary: title + participants + action items array + decisions
entity extraction: text + type + start/end positions
product extraction: name / category / price / stock / tags from a product description

All of these are shapes that actually come up in production workflows.

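Summary

Structured output from LLMs has a two-stage structure: schema → prompt conversion plus output validation
On the prompt side, enum / min, max / required / description and so on are restated as natural-language instructions
The trailing "Use null. Do not invent data." suppresses hallucination
Example output synthesis exists to convey the shape of the schema, so real values are unnecessary (placeholders such as <email> are fine)
A validator that emits path-tagged errors at the boundary can feed a retry / feedback loop
There is no need to implement all of JSON Schema; the subset that LLM workflows use is enough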

Repository: https://github.com/sen-ltd/schema-prompt

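This tool was built as #249 in our OSS portfolio. It is the third entry in the "Trying Out the Tech Radar" series. The previous entry was #248 Markdown → Typst, and the one before that was #247 TOON converter. Next up is Server-driven UI. SEN 合同会社 (Tokyo) continuously publishes small, sharp tools: https://sen.ltd/portfolio/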
