The quality of a four-choice quiz is decided by its wrong answers: implementing automatic distractor generation for a yojijukugo (four-character idiom) quiz

Posted at 2026-06-10

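When building a four-choice quiz app, the part that most often gets shortchanged is how the wrong choices (distractors) are picked. Most implementations pick three at random from everything except the correct answer, but then the question can be solved by elimination. In a question like "What does 切磋琢磨 mean?", if only one choice has an effort-related meaning, you can answer correctly without knowing the idiom. A good question is one whose wrong answers are confusable. While building a yojijukugo (four-character idiom) quiz, I implemented an algorithm that generates distractors automatically from a priority score combining shared kanji and meaning category. The RNG is injected, so deterministic tests can be written as well.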

🌐 Demo: https://sen.ltd/portfolio/yojijukugo/
📦 GitHub: https://github.com/sen-ltd/yojijukugo

What's wrong with random distractors

The naive way to generate a question:

// Bad example
const distractors = shuffle(pool.filter(x => x !== answer)).slice(0, 3);

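A question generated this way:

What does "切磋琢磨" mean?
a. Encouraging one another among peers and improving through competition. ← correct
b. The beautiful scenery of nature and refined pastimes. ← 花鳥風月
c. Dithering and being unable to make decisions. ← 優柔不断
d. Being surrounded by enemies with no allies. ← 四面楚歌

If you vaguely know that 切磋琢磨 is an effort-related word, b, c, and d can all be eliminated because they belong to different semantic domains. The question ends up measuring the ability to read the distribution of choices rather than knowledge of the idiom.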
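To get a feel for how often random selection produces this kind of question, here is a rough back-of-the-envelope calculation. It is only a sketch: the pool size of 79 comes from the data set, but the figure of 9 other idioms sharing the answer's category is an assumption (8 categories of roughly equal size), and probNoSameCategory is a helper name made up for this illustration.

```javascript
// Probability that none of k randomly drawn distractors shares the answer's category.
// others:  number of entries other than the answer
// sameCat: how many of those share the answer's category (an assumed value here)
function probNoSameCategory(others, sameCat, k = 3) {
  let p = 1;
  for (let i = 0; i < k; i++) {
    // Draw without replacement: each pick must land outside the category.
    p *= (others - sameCat - i) / (others - i);
  }
  return p;
}

// 79 entries -> 78 others; assume about 9 of them share the category.
const p = probNoSameCategory(78, 9);
console.log(p.toFixed(3)); // "0.689"
```

Under that assumption, roughly two out of three randomly generated questions would contain no distractor from the answer's own category.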

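Two axes of confusability

People confuse idioms in roughly two ways:

They look alike: they contain the same kanji. 一期一会 and 一蓮托生, or 七転八起 and 二転三転. Especially effective in reading-guess mode.
They mean similar things: they belong to the same semantic domain. 切磋琢磨 and 臥薪嘗胆 (both about effort). Effective in meaning-guess mode.

Turn the two axes into a score: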

export function sharedKanjiCount(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  let n = 0;
  for (const ch of setA) {
    if (setB.has(ch)) n++;
  }
  return n;
}

export function rankDistractors(answer, pool = IDIOMS) {
  const others = pool.filter((x) => x.word !== answer.word);

  const withScore = others.map((x) => {
    const kanji = sharedKanjiCount(answer.word, x.word);
    const sameCategory = x.category === answer.category ? 1 : 0;
    // Shared kanji dominate: 10 points per character. Same category adds 1 point.
    return { entry: x, score: kanji * 10 + sameCategory };
  });

  withScore.sort((a, b) => b.score - a.score);
  return withScore;
}

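The intent of the 10:1 weighting: an idiom that shares even one kanji with the answer always outranks one that only matches on category, because the category bonus is at most 1 point. Idioms that share 一, for example, are easy to confuse in both reading and appearance. Category match works as a tiebreaker among candidates with the same number of shared kanji.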
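A small worked example makes the weighting concrete. This is a self-contained sketch, not the project's code: score is a helper introduced here, and the category labels ("変化", "態度") are made up for illustration, since the real data's labels for these idioms are not shown.

```javascript
// Count distinct characters that appear in both words.
function sharedKanjiCount(a, b) {
  const setB = new Set(b);
  let n = 0;
  for (const ch of new Set(a)) if (setB.has(ch)) n++;
  return n;
}

// 10 points per shared kanji, plus 1 point for a category match.
function score(answer, other) {
  const sameCategory = other.category === answer.category ? 1 : 0;
  return sharedKanjiCount(answer.word, other.word) * 10 + sameCategory;
}

// Category labels below are hypothetical.
const answer = { word: "急転直下", category: "変化" };
const pool = [
  { word: "有為転変", category: "変化" }, // shares 転, same category -> 11
  { word: "単刀直入", category: "態度" }, // shares 直 only            -> 10
  { word: "千変万化", category: "変化" }, // same category only        -> 1
  { word: "花鳥風月", category: "自然" }, // nothing in common         -> 0
];
for (const x of pool) console.log(x.word, score(answer, x));
```

Note that 単刀直入 (10) still outranks 千変万化 (1): one shared kanji beats a category match on its own.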

On the data side, each idiom is tagged with a meaning category:

export const IDIOMS = [
  { word: "切磋琢磨", reading: "せっさたくま", meaning: "仲間同士で励まし合い...", category: "努力" },
  { word: "臥薪嘗胆", reading: "がしんしょうたん", meaning: "目的達成のために...", category: "努力" },
  { word: "花鳥風月", reading: "かちょうふうげつ", meaning: "自然の美しい景色...", category: "自然" },
  // 79 entries, 8 categories
];

Sampling: 3 from the top 6

Using the top 3 by score as-is would give the same question the same choices every time. For replayability, sample 3 from the top 6:

export function generateQuestion(answer, opts = {}) {
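  // The snippet used these names without defining them; the defaults here are assumptions.
  const { difficulty = "hard", rng = Math.random, pool = IDIOMS } = opts;
  const ranked = rankDistractors(answer, pool);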
  let picked;
  if (difficulty === "hard") {
    const top = ranked.slice(0, 6).map((r) => r.entry);
    picked = shuffle(top, rng).slice(0, 3);
  } else {
    // easy: random picks from the bottom half of the ranking
    const bottom = ranked.slice(Math.floor(ranked.length / 2)).map((r) => r.entry);
    picked = shuffle(bottom, rng).slice(0, 3);
  }
  // ...
}

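The "やさしい" (easy) mode picks from the bottom half of the ranking, i.e. idioms that are distant in both kanji and category. Difficulty adjustment is nothing more than the range that distractors are drawn from.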
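The idea that difficulty is just a sampling range can be isolated in a few lines. The following is a minimal sketch under stated assumptions: pickDistractors is a hypothetical helper, and the ranked list is a toy array of placeholder names rather than real idiom entries.

```javascript
// Fisher-Yates shuffle on a copy of the input, with an injectable RNG.
function shuffle(arr, rng = Math.random) {
  const a = [...arr];
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

// ranked: candidates already sorted from most to least confusable.
function pickDistractors(ranked, difficulty, rng = Math.random) {
  const candidates =
    difficulty === "hard"
      ? ranked.slice(0, 6) // top 6 by score
      : ranked.slice(Math.floor(ranked.length / 2)); // bottom half
  return shuffle(candidates, rng).slice(0, 3);
}

// Toy ranking: idiom0 is the most confusable, idiom19 the least.
const ranked = Array.from({ length: 20 }, (_, i) => `idiom${i}`);
const hard = pickDistractors(ranked, "hard");
const easy = pickDistractors(ranked, "easy");
console.log(hard, easy);
```

With this shape, adding an intermediate difficulty would only mean adding another slice of the ranking.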

A hard-mode question that was actually generated:

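How is "急転直下" read?
a. ういてんぺん ← 有為転変 (shares 転)
b. きゅうてんちょっか ← correct
c. たんとうちょくにゅう ← 単刀直入 (shares 直)
d. しんきいってん ← 心機一転 (shares 転)

Every distractor shares a kanji with the answer, and the sounds "てん" and "ちょく" overlap as well. Random selection would almost never produce a set this dense.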

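Deterministic tests via RNG injection

Quiz generation is inherently random, but tests should be deterministic. Instead of calling Math.random directly, inject the RNG as an argument:

// Mulberry32: a small deterministic PRNG with a 32-bit seed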
export function mulberry32(seed) {
  let s = seed >>> 0;
  return function () {
    s = (s + 0x6d2b79f5) >>> 0;
    let t = s;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

export function shuffle(arr, rng = Math.random) {
  const a = [...arr];
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

With a fixed seed, the test is reproducible:

test("deterministic with same seed", () => {
  const q1 = generateQuestion(answer, { rng: mulberry32(5) });
  const q2 = generateQuestion(answer, { rng: mulberry32(5) });
  assert.deepEqual(q1.choices.map((c) => c.key), q2.choices.map((c) => c.key));
});
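The same property can be checked outside the test runner. The sketch below repeats mulberry32 and shuffle so that it runs on its own; the deck array and the seed 42 are arbitrary values chosen for illustration.

```javascript
// Mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let s = seed >>> 0;
  return function () {
    s = (s + 0x6d2b79f5) >>> 0;
    let t = s;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy, driven by the injected RNG.
function shuffle(arr, rng = Math.random) {
  const a = [...arr];
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

const deck = ["a", "b", "c", "d", "e", "f"];
const first = shuffle(deck, mulberry32(42));
const second = shuffle(deck, mulberry32(42));
console.log(first.join("") === second.join("")); // true: same seed, same order
console.log(deck.join("")); // "abcdef": the input array is not mutated
```

Each call to mulberry32(42) creates a fresh generator with its own state, which is why the two shuffles do not interfere with each other.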

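Testing statistically whether hard is really hard

This is the interesting test. It verifies that hard-mode distractors are drawn from higher ranks than easy-mode ones (a lower index in the ranking means more confusable) by comparing rank totals averaged over 30 seeds: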

test("hard difficulty picks higher-ranked distractors than easy", () => {
  const ranked = rankDistractors(answer).map((r) => r.entry.word);
  const rankOf = (word) => ranked.indexOf(word);

  let hardSum = 0, easySum = 0, n = 0;
  for (let seed = 0; seed < 30; seed++) {
    const qh = generateQuestion(answer, { difficulty: "hard", rng: mulberry32(seed) });
    const qe = generateQuestion(answer, { difficulty: "easy", rng: mulberry32(seed + 1000) });
    for (const c of qh.choices.filter((c) => !c.correct)) hardSum += rankOf(c.key);
    for (const c of qe.choices.filter((c) => !c.correct)) easySum += rankOf(c.key);
    n++;
  }
  assert.ok(hardSum / n < easySum / n);
});

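With a single seed, easy mode could happen to draw the harder choices, so the property is asserted on an average over multiple seeds. The pattern applies generally to testing code that involves randomness.

Data integrity tests

The 79 data entries are also something to protect with tests: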

test("every word is exactly 4 chars", () => {
  for (const x of IDIOMS) {
    assert.equal([...x.word].length, 4, `${x.word} is not 4 kanji`);
  }
});

test("no duplicate words", () => {
  const words = IDIOMS.map((x) => x.word);
  const dupes = words.filter((w, i) => words.indexOf(w) !== i);
  assert.deepEqual(dupes, []);
});

test("readings are pure hiragana", () => {
  for (const x of IDIOMS) {
    assert.match(x.reading, /^[ぁ-ゖー]+$/);
  }
});

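These tests actually saved me. While building the data I made three mistakes: Cyrillic characters crept into the entry for 四面楚歌 (четы面楚歌), an English word crept into another row (right唯々諾々), and 温故知新 was registered twice. The four-character check and the duplicate check caught all of them. If you embed data in code, write integrity tests along with it.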
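To show how those checks react to exactly this kind of mistake, here is a sketch that runs them against a small contaminated sample. The validate helper and the sample rows are made up for this illustration; they mimic the three mistakes but are not the project's actual test code or data.

```javascript
// Sample rows reproducing the three kinds of mistakes (illustrative data only).
const sample = [
  { word: "четы面楚歌", reading: "しめんそか" },
  { word: "right唯々諾々", reading: "いいだくだく" },
  { word: "温故知新", reading: "おんこちしん" },
  { word: "温故知新", reading: "おんこちしん" },
];

// Returns a list of human-readable problems; an empty list means the data is clean.
function validate(entries) {
  const problems = [];
  const seen = new Set();
  for (const { word, reading } of entries) {
    // Spread counts code points, not UTF-16 code units.
    if ([...word].length !== 4) problems.push(`${word}: not 4 characters`);
    if (seen.has(word)) problems.push(`${word}: duplicate`);
    seen.add(word);
    if (!/^[ぁ-ゖー]+$/.test(reading)) problems.push(`${word}: reading is not pure hiragana`);
  }
  return problems;
}

console.log(validate(sample));
```

The two contaminated words fail the length check and the repeated row fails the duplicate check, so the sample yields three problems in total.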

[...x.word].length is used because String.length returns the number of UTF-16 code units. In case a kanji that needs a surrogate pair (such as 𠮟) ends up in an idiom, spreading the string counts code points instead.
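The difference is easy to see on a single string. In this sketch the word is a hypothetical spelling that uses 𠮟 (U+20B9F), written as an escape so the code point is unambiguous; it is not taken from the project's data.

```javascript
// U+20B9F lies outside the Basic Multilingual Plane, so it takes two UTF-16 code units.
const word = "\u{20B9F}咤激励";

console.log(word.length);      // 5: UTF-16 code units
console.log([...word].length); // 4: code points, which is what the test should count
```

A check written with word.length would wrongly reject this entry as having five characters.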

Design

data.js  ← 79 idioms ({word, reading, meaning, category})
quiz.js  ← distractor ranking + question generation + scoring (DOM-free, 32 tests)
app.js   ← UI glue

quiz.js has no DOM dependency at all. The UI just calls generateQuiz(10, { mode, difficulty }) and turns the choices array into buttons.

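Try it

Demo: https://sen.ltd/portfolio/yojijukugo/
GitHub: https://github.com/sen-ltd/yojijukugo

Try the reading-guess quiz in "むずかしい" (hard) mode. The despair you feel when a question about an idiom starting with 一 shows choices that all read "いち◯◯◯◯" is the algorithm's payoff.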

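Summary

The quality of a four-choice quiz is decided by its distractors. Random selection is defeated by elimination.
Confusability can be scored on two axes: shared kanji (appearance) and same category (meaning). The weights are 10:1, favoring appearance.
Sampling 3 from the top 6 by score gives replayability. Difficulty = sampling range.
Make the RNG injectable (mulberry32). Tests become deterministic.
Statistical properties such as "hard is harder than easy" are tested on an average over multiple seeds.
Apps that embed their data should ship data integrity tests (character count, duplicates, character class) alongside it. In practice they caught one Cyrillic contamination, one stray English word, and one duplicate entry.

This is entry #259 in SEN 合同会社's OSS portfolio. https://sen.ltd/portfolio/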
