コンピューター側がヒット＆ブローを頑張って八百長するコード

Posted at 2025-08-24

はじめの前のおねがい

できれば「いいね♡」をお願いします。励みになります。

はじめに

免責：このコードは投稿者が統計を学び考察する目的で意図的にコンピューター側が八百長をする仕組みを組み込んだコードです。本コードの使用による損害はいかなる場合も保証しません。本コードを使用したことでユーザーはこの注意書きに同意したとみなします。

注意：通常にゲームを遊ぶ目的の場合はヒット＆ブロー（3桁または4桁の数字を当てるやつ）の記事記載のコードをご利用ください。

このコードは上記のヒット＆ブロー（3桁または4桁の数字を当てるやつ）を作った後、「だったら八百長ゲームの場合は正答率はどうなるの？」という疑問から、副次的に作成することになりました。
このコードはPython 3とPythonista 3に対応しています。

本コードを実行するとどうなるか

基本動作

ヒット＆ブロー — Ultra Adversarial：適応的八百長（整合厳守）（3桁・各桁相異（重複なし））
各桁が異なる3桁を入力してください（例：012）。
コマンド：history / undo / explain / profile / reveal / quit

推測（3桁）＞

上記のプロンプトに

入力画面になるので3桁（4桁指定の時には4桁）のそれぞれ異なる半角数字を入力して下さい
HitとBlowが0H1Bのように表示されるのでその情報を元に次の数字を入力して下さい
コマンドhistoryで履歴、undoでやり直し、revealで降参、quitで終了です

と入力します。

このコードの特徴

本コードはコンピューター側が八百長をします。細かい設定方法は次の章で後述しますが、下記の条件のもとにコンピューター側が、正解の数値を数字を差し替えます。

初手は最も正解率が低くなるように0H1Bになるようにする
※ 0H0Bの方がユーザーは「その数値は必ず当たらない」と判断できるために、コンピューター側が差し替えられる数字が少なくなってしまうためです。
第2手からユーザーの提示した数値をもとに最も確率の低い数字に正解を変更していく
コンピューターの提示したヒットとブローの整合性は守るようにする
※ 八百長なのですからヒットとブローを適当に返し、常に当たらないようにすればいいじゃないか、という件ですが、あくまで本コードは数学的にどうすれば手数が引き延ばせるかが論点ですので、ルールを破る（八百長する）部分は「最初にホスト側が数字を決める」の部分のみです。
ゲームごとのユーザーの「数値の嗜好」を積み重ねていき、不正解率を上げていく

ただしDonald Knuth先生など様々な先行者が、最適な数字の提示の仕方を研究していて、そういった手法のもとに最適に進めていくと、大抵は５(1)手で当たってしまいます。逆を言えばこのコードは5(1)手まではユーザーに正解させないコードということでもあります。

カスタマイズ

ソースコード冒頭の

# 設定（必要なら変更可）
CODE_LENGTH: int = 3                 # 3 または 4（既定 = 3）
ALLOW_DUPLICATES: bool = False       # False: 桁重複 禁止 / True: 許容（既定 = 禁止）
ALLOW_LEADING_ZERO: bool = True      # 先頭 0 を許容
RIG_MODE: bool = True                # True: 八百長する / False: 八百長しない
SECRET_OVERRIDE: str | None = None   # RIG_MODE=False のときのみ有効

RNG_SEED: int | None = None          # None: UTCナノ秒 / int: 再現用
DETERMINISTIC_REPRESENTATIVE: bool = True  # True: lexicographically smallest

# 学習モデル＆先読み（ゲーム内のみ有効）
DECAY: float = 0.995
LAPLACE: float = 1.0

# 先読み規模（速度・堅牢性のトレードオフ）
LOOKAHEAD_K: int = 48                  # 次手予測サンプル数
LOOKAHEAD_MAX_S: int = 6000            # |S| がこれを超えると 1プライ先読みを抑制
DEPTH2_ENABLE: bool = True             # 2手先ビームを有効化
DEPTH2_BEAM_R: int = 6                 # 応答 r の上位何件まで 2プライ展開
DEPTH2_K2: int = 24                    # 2手目の予測サンプル数
DEPTH2_MAX_S: int = 2500               # |S| がこれを超えると 2プライ無効化

# スコア重み（多目的合成）
W_SIZE: float = 1.00      # 即時 |S_r|
W_MIN: float = 0.45       # 次手最大バケット規模の「最小値」
W_MEAN: float = 0.35      # 同「平均値」
W_Q10: float = 0.20       # 同「下位10%分位」
# 2プライの寄与
W2_MIN: float = 0.30
W2_MEAN: float = 0.20

の部分をいじることで、色々とカスタマイズできます。どうせこのコードを熟読する人はこのコードで使用しているモデルが分かる人だと思いますので、詳細は省略しますが、よくわからんけどやってみたいみたいな人は、ここだけいじれば大丈夫です。デフォルトでは八百長モードになっています。RIG_MODE: bool = falseで八百長しなくなります。

RIG_MODE: bool = True                # True: 八百長する / False: 八百長しない

を

Pythonの必要なモジュール

ありません。

ソースコード

hit_blow_host_regged.py

"""
Hit & Blow 八百長システム
"""

from __future__ import annotations

import random
import statistics
import time
from itertools import permutations, product
from typing import Dict, List, Sequence, Tuple

# ──────────────────────────────────────────────────────────
# 設定（必要なら変更可）
CODE_LENGTH: int = 3                 # 3 または 4（既定 = 3）
ALLOW_DUPLICATES: bool = False       # False: 桁重複 禁止 / True: 許容（既定 = 禁止）
ALLOW_LEADING_ZERO: bool = True      # 先頭 0 を許容
RIG_MODE: bool = True                # True: 八百長する / False: 八百長しない
SECRET_OVERRIDE: str | None = None   # 答えの数値を固定する。ただし RIG_MODE=False のときのみ有効

RNG_SEED: int | None = None          # None: UTCナノ秒 / int: 再現用
DETERMINISTIC_REPRESENTATIVE: bool = True  # True: 最小

# 学習モデル＆先読み
DECAY: float = 0.995
LAPLACE: float = 1.0

# 先読み規模（速度・堅牢性のトレードオフ）
LOOKAHEAD_K: int = 48                  # 次手予測サンプル数
LOOKAHEAD_MAX_S: int = 6000            # |S| がこれを超えると 1プレイ先読みを抑制
DEPTH2_ENABLE: bool = True             # 2手先ビームを有効化
DEPTH2_BEAM_R: int = 6                 # 応答 r の上位何件まで 2プレイ展開
DEPTH2_K2: int = 24                    # 2手目の予測サンプル数
DEPTH2_MAX_S: int = 2500               # |S| がこれを超えると 2プレイ無効化

# スコア重み（多目的合成）
W_SIZE: float = 1.00      # 即時 |S_r|
W_MIN: float = 0.45       # 次手最大バケット規模の「最小値」
W_MEAN: float = 0.35      # 同「平均値」
W_Q10: float = 0.20       # 同「下位10%分位」
# 2プライの寄与
W2_MIN: float = 0.30
W2_MEAN: float = 0.20
# ──────────────────────────────────────────────────────────

def hb(secret: str, guess: str) -> Tuple[int, int]:
    L = CODE_LENGTH
    h = sum(1 for i in range(L) if secret[i] == guess[i])
    b = sum(1 for ch in guess if ch in secret) - h
    return h, b

def validate_guess(text: str) -> str:
    L = CODE_LENGTH
    s = text.strip()
    if len(s) != L or not s.isdigit():
        sample = "012" if L == 3 else "0123"
        raise ValueError(f"推測は {L} 桁の数字で入力してください（例：{sample}）。")
    if not ALLOW_DUPLICATES and len(set(s)) != L:
        raise ValueError("各桁は相異である必要があります（重複は不可です）。")
    if not ALLOW_LEADING_ZERO and s[0] == "0":
        raise ValueError("先頭 0 は不可です。別の数を入力してください。")
    return s

def all_candidates() -> List[str]:
    digits = "0123456789"
    L = CODE_LENGTH
    out: List[str] = []
    if ALLOW_DUPLICATES:
        for tup in product(digits, repeat=L):
            if not ALLOW_LEADING_ZERO and tup[0] == "0":
                continue
            out.append("".join(tup))
    else:
        for tup in permutations(digits, L):
            if not ALLOW_LEADING_ZERO and tup[0] == "0":
                continue
            out.append("".join(tup))
    return out

def partition_by_response(cands: Sequence[str], guess: str) -> Dict[Tuple[int, int], List[str]]:
    buckets: Dict[Tuple[int, int], List[str]] = {}
    for s in cands:
        key = hb(s, guess)
        buckets.setdefault(key, []).append(s)
    return buckets

# ──────────────────────────────────────────────────────────
# ユーザモデル：位置条件付き分布 + 桁ペア共起
# ──────────────────────────────────────────────────────────
class UserModel:
    def __init__(self, L: int):
        self.L = L
        self.digit = [LAPLACE] * 10
        self.pos_digit = [[LAPLACE] * 10 for _ in range(L)]
        self.first_digit = [LAPLACE] * 10
        # 位置間の桁ペア（粗い相関の近似）：(i<j, d,e) の共起
        self.pair = [[[LAPLACE] * 10 for _ in range(10)] for __ in range(L*(L-1)//2)]
        self._pair_index = [(i, j) for i in range(L) for j in range(i+1, L)]
        self.repeat = [LAPLACE, LAPLACE]
        self.seen = 0

    def decay(self) -> None:
        for d in range(10):
            self.digit[d] *= DECAY
            self.first_digit[d] *= DECAY
        for i in range(self.L):
            for d in range(10):
                self.pos_digit[i][d] *= DECAY
        for k in range(len(self._pair_index)):
            for d in range(10):
                for e in range(10):
                    self.pair[k][d][e] *= DECAY
        self.repeat[0] *= DECAY
        self.repeat[1] *= DECAY

    def update_from_guess(self, guess: str) -> None:
        self.decay()
        self.seen += 1
        used = set(guess)
        has_rep = (len(used) != len(guess))
        self.repeat[1 if has_rep else 0] += 1.0
        for i, ch in enumerate(guess):
            d = ord(ch) - 48
            self.digit[d] += 1.0
            self.pos_digit[i][d] += 1.0
        self.first_digit[int(guess[0])] += 1.0
        # 桁ペア
        for idx, (i, j) in enumerate(self._pair_index):
            di = ord(guess[i]) - 48
            dj = ord(guess[j]) - 48
            self.pair[idx][di][dj] += 1.0

    def _weights_for_position(self, pos: int) -> List[float]:
        return self.pos_digit[pos][:]

    def sample_guess(self, allow_dup: bool, allow_leading_zero: bool) -> str:
        """桁ペア相関を弱く反映しつつ位置別分布を主としたサンプルを生成"""
        L = self.L
        out: List[int] = [-1] * L
        available = set(range(10))

        # 先頭
        w0 = self._weights_for_position(0)
        if not allow_leading_zero:
            w0[0] = 0.0
        d0 = roulette(w0)
        out[0] = d0
        if not allow_dup:
            available.discard(d0)

        # 残り
        for i in range(1, L):
            w = self._weights_for_position(i)
            # ペア補正（先頭との相関のみ簡易反映）
            idx = pair_index_lookup(0, i, L)
            if idx is not None:
                for d in range(10):
                    w[d] *= self.pair[idx][d0][d]
            if not allow_dup:
                for d in range(10):
                    if d not in available:
                        w[d] = 0.0
            pick = roulette(w)
            out[i] = pick
            if not allow_dup:
                available.discard(pick)

        return "".join(str(d) for d in out)

    def batch_predict(self, k: int, allow_dup: bool, allow_leading_zero: bool) -> List[str]:
        s: set[str] = set()
        trials = 0
        max_trials = k * 25
        while len(s) < k and trials < max_trials:
            trials += 1
            g = self.sample_guess(allow_dup, allow_leading_zero)
            s.add(g)
        return list(s)

def pair_index_lookup(i: int, j: int, L: int) -> int | None:
    if i >= j:
        return None
    idx = 0
    for a in range(L):
        for b in range(a+1, L):
            if a == i and b == j:
                return idx
            idx += 1
    return None

def roulette(weights: List[float]) -> int:
    s = sum(max(w, 0.0) for w in weights)
    if s <= 0.0:
        return random.randint(0, 9)
    r = random.random() * s
    acc = 0.0
    for d, w in enumerate(weights):
        acc += max(w, 0.0)
        if r <= acc:
            return d
    return 9

# ──────────────────────────────────────────────────────────
# スコアリング：情報利得を下げるための関数
# ──────────────────────────────────────────────────────────
def quantile(xs: List[float], q: float) -> float:
    if not xs:
        return 0.0
    xs_sorted = sorted(xs)
    k = max(0, min(len(xs_sorted)-1, int(q * (len(xs_sorted)-1))))
    return xs_sorted[k]

def choose_response_adversarial(S: List[str], guess: str, model: UserModel) -> Tuple[Tuple[int, int], List[str]]:
    """Ultra adversarial 応答選択：
       U(r) = W_SIZE * |S_r|
            + W_MIN  * min_{g' ~ model}[max_bucket_size(S_r, g')]
            + W_MEAN * mean_{g' ~ model}[max_bucket_size(S_r, g')]
            + W_Q10  * q10_{g' ~ model}[max_bucket_size(S_r, g')]
            (+ depth2 の寄与)
    """
    buckets = partition_by_response(S, guess)
    items = sorted(((resp, lst) for resp, lst in buckets.items()),
                   key=lambda x: -len(x[1]))

    do_lookahead = (len(S) <= LOOKAHEAD_MAX_S and LOOKAHEAD_K > 0)
    predicted = model.batch_predict(LOOKAHEAD_K, ALLOW_DUPLICATES, ALLOW_LEADING_ZERO) if do_lookahead else []

    best_key = None
    best_resp = None
    best_subS = None

    for rank, (resp, subS) in enumerate(items):
        base_size = float(len(subS))
        if do_lookahead and subS:
            max_sizes: List[float] = []
            for g2 in predicted:
                parts2 = partition_by_response(subS, g2)
                m2 = max((len(v) for v in parts2.values()), default=0)
                max_sizes.append(float(m2))
            if max_sizes:
                score_min = min(max_sizes)
                score_mean = statistics.fmean(max_sizes)
                score_q10 = quantile(max_sizes, 0.10)
            else:
                score_min = score_mean = score_q10 = 0.0
        else:
            score_min = score_mean = score_q10 = 0.0

        util = (W_SIZE * base_size
                + W_MIN * score_min
                + W_MEAN * score_mean
                + W_Q10 * score_q10)

        if DEPTH2_ENABLE and len(S) <= DEPTH2_MAX_S and subS and rank < DEPTH2_BEAM_R:
            predicted2 = model.batch_predict(DEPTH2_K2, ALLOW_DUPLICATES, ALLOW_LEADING_ZERO)
            second_scores = []
            for g2 in predicted2:
                parts2 = partition_by_response(subS, g2)
                if not parts2:
                    continue
                m2 = max(len(v) for v in parts2.values())
                second_scores.append(float(m2))
            if second_scores:
                util += (W2_MIN * min(second_scores) + W2_MEAN * statistics.fmean(second_scores))

        # タイブレーク：util 大 → H 小 → B 小 → 応答辞書順
        key = (util, -resp[0], -resp[1], tuple(resp))
        if best_key is None or key > best_key:
            best_key = key
            best_resp = resp
            best_subS = subS

    if best_resp is None:
        best_resp, best_subS = items[0]
    return best_resp, best_subS

# ──────────────────────────────────────────────────────────
# 表示
# ──────────────────────────────────────────────────────────
def print_history(history: List[Tuple[str, int, int]]) -> None:
    if not history:
        print("（履歴なし）")
        return
    print("手数  推測    判定")
    for idx, (g, h, b) in enumerate(history, 1):
        print(f"{idx:>3}  {g}   {h}H{b}B")

def print_profile(model: UserModel) -> None:
    L = model.L
    print(f"[学習統計] 観測手数 = {model.seen}")
    top_overall = sorted([(d, model.digit[d]) for d in range(10)], key=lambda x: -x[1])[:5]
    print("全体桁頻度（上位）:", " ".join(f"{d}:{w:.1f}" for d, w in top_overall))
    for i in range(L):
        top_i = sorted([(d, model.pos_digit[i][d]) for d in range(10)], key=lambda x: -x[1])[:5]
        print(f"位置 {i} 上位桁:", " ".join(f"{d}:{w:.1f}" for d, w in top_i))
    tot_rep = model.repeat[0] + model.repeat[1]
    p_norep = model.repeat[0] / tot_rep if tot_rep > 0 else 0.5
    p_rep = model.repeat[1] / tot_rep if tot_rep > 0 else 0.5
    print(f"重複なし傾向 ≈ {p_norep:.2%}／重複あり傾向 ≈ {p_rep:.2%}")

# ──────────────────────────────────────────────────────────
# ゲームルーチン
# ──────────────────────────────────────────────────────────
def representative_choice(S: List[str]) -> str:
    if not S:
        return "????"
    return min(S) if DETERMINISTIC_REPRESENTATIVE else random.choice(S)

def play_one_game() -> str:
    """1ゲームを実行。戻り値:
       - 'QUIT'  : ユーザが quit/exit/q を入力（即時全体終了）
       - 'DONE'  : ゲーム終了（勝利/降参等）。呼び出し側で再戦確認を行う。
    """
    if RNG_SEED is not None:
        random.seed(RNG_SEED)
    else:
        random.seed(time.time_ns())

    # 学習モデルはゲーム内のみ有効（終了時に破棄されます）
    model = UserModel(CODE_LENGTH)

    history: List[Tuple[str, int, int]] = []
    stack: List[Tuple[List[Tuple[str, int, int]], List[str], str | None]] = []
    start_ts = time.time()

    if RIG_MODE:
        S: List[str] = all_candidates()
        representative: str | None = representative_choice(S)
    else:
        S = []
        representative = SECRET_OVERRIDE if SECRET_OVERRIDE is not None else representative_choice(all_candidates())

    mode_len = f"{CODE_LENGTH}桁"
    mode_dup = "桁重複 許容" if ALLOW_DUPLICATES else "各桁相異（重複なし）"
    example = "012" if CODE_LENGTH == 3 else "0123"
    freedom = "異なる" if not ALLOW_DUPLICATES else "自由な"
    rig = "適応的八百長（整合厳守）" if RIG_MODE else "固定出題（通常モード）"

    print(f"\nヒット＆ブロー — Ultra Adversarial：{rig}（{mode_len}・{mode_dup}）")
    print(f"各桁が{freedom}{CODE_LENGTH}桁を入力してください（例：{example}）。")
    print("コマンド：history / undo / explain / profile / reveal / quit\n")

    while True:
        line = input(f"推測（{CODE_LENGTH}桁）＞ ").strip()
        lower = line.lower()

        if lower in ("quit", "exit", "q"):
            return "QUIT"

        if lower == "history":
            print_history(history)
            continue

        if lower == "undo":
            if not stack:
                print("取り消せません（これ以上戻れません）。")
            else:
                history, S, representative = stack.pop()
                print("直前の推測を取り消しました。")
                print_history(history)
            continue

        if lower == "explain":
            if RIG_MODE:
                print(f"|S| = {len(S)} 候補")
                if history:
                    last_guess = history[-1][0]
                    buckets = partition_by_response(S, last_guess)
                    max_size = max(len(v) for v in buckets.values())
                    print(f"直近手 {last_guess} に対する最大バケット規模: {max_size}")
            else:
                print("固定出題モードのため候補集合は公開しません。")
            continue

        if lower == "profile":
            print_profile(model)
            continue

        if lower == "reveal":
            if RIG_MODE:
                print(f"降参。いまの潜在的正解は {representative} です。")
            else:
                print(f"降参。解答は {representative} です。")
            print(f"試行回数：{len(history)}，経過時間：{time.time() - start_ts:.1f} 秒")
            return "DONE"

        # 数値入力
        try:
            guess = validate_guess(line)
        except ValueError as exc:
            print(f"エラー：{exc}")
            continue

        # 学習（推測観測）— ゲーム内のみ
        model.update_from_guess(guess)

        # Undo 用スナップショット
        stack.append((history[:], S[:], representative))

        if RIG_MODE:
            resp, subS = choose_response_adversarial(S, guess, model)
            S = subS
            h, b = resp
            representative = representative_choice(S)
            history.append((guess, h, b))
            print(f"判定：{h}H{b}B")

            if h == CODE_LENGTH and b == 0:
                print("\n正解")
                print(f"解答：{representative}")
                print(f"試行回数：{len(history)}，経過時間：{time.time() - start_ts:.1f} 秒")
                print("\n--- 履歴 ---")
                print_history(history)
                return "DONE"
        else:
            # 固定出題
            h, b = hb(representative, guess)
            history.append((guess, h, b))
            print(f"判定：{h}H{b}B")
            if h == CODE_LENGTH and b == 0:
                print("\n正解")
                print(f"解答：{representative}")
                print(f"試行回数：{len(history)}，経過時間：{time.time() - start_ts:.1f} 秒")
                print("\n--- 履歴 ---")
                print_history(history)
                return "DONE"

def main() -> None:
    if RNG_SEED is not None:
        random.seed(RNG_SEED)
    else:
        random.seed(time.time_ns())

    while True:
        status = play_one_game()
        if status == "QUIT":
            print("終了します。")
            break
        # 再戦確認（quit 以外の通常終了時のみ）。学習はゲームごとに破棄されます。
        ans = input("再度ゲームをしますか？（y/n）＞ ").strip().lower()
        if ans in ("y", "yes"):
            continue
        print("終了します。")
        break

if __name__ == "__main__":
    main()

以上です。お役に立てれば幸いです。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up