I Changed a Coefficient from 0.700 to 0.900 and Hit Rate Jumped 7.6 Points: How Releasing an Over-Penalizing Coefficient Unlocked V6 of My Horse Racing AI

Posted at 2026-06-08

Yosoya yuji: 71 years old, retired, Ubuntu user, self-taught in Python.
I have been building a horse racing prediction AI for the past six months.
Previous article: I Implemented a "Dark Matter Coefficient" and Only Kyoto Broke

Introduction

By the end of V5 (May 24–26), the "Mixer" — an interactive coefficient management tool — was fully operational.
The cycle of auto-calculation → CSV storage → Mixer management was running smoothly.

That's when a question surfaced.

"Can we apply the jockey rank coefficient to horses too? Rank them by distance aptitude and factor that into the score."

That single line became the starting point of V6.

What We Built in V6 (Overview)

No.	Date	Change	Type
No.40	06/02	Horse distance rank correction (the V6 core)	New feature
No.41	06/04	result.html popularity fetch BUGfix	BUGfix
No.42	06/04	Complete removal of scratched horses	BUGfix
No.43	06/04	Switch comment log to terminal capture method	Improvement
No.44	06/05	Dual-operation DB sync (horse_db_sync.py)	New feature

Then on June 6th, we ran a full 868-race live evaluation.

The Mixer Earned Its Keep: Emergency Relaxation of 0.700 → 0.900

Why Was It Urgent?

During V5's 46-race validation, a troubling pattern appeared.

Horses ridden by Rank-B jockeys: N=4, top-3 hit rate = 100% (no correction applied)

Yet V58py was slapping those same horses with a −30% penalty. Coefficient: 0.700.

This is a textbook case of a small-sample coefficient backfiring.
With N=4 to 8, the data simply isn't stable enough to trust the computed penalty.
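
To put a number on how little N=4 pins down, a Wilson score interval is one quick check. This is a minimal sketch and not part of the Mixer: the function name `wilson_interval` is hypothetical, and the 95% level (z = 1.96) is an assumption.

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Return the (low, high) Wilson score interval for a hit rate of hits/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the edges
    return max(0.0, center - margin), min(1.0, center + margin)

# The Rank-B case from the validation: 4 top-3 finishes out of 4 runs
low, high = wilson_interval(4, 4)
print(f"4/4 top-3 hits: plausible true rate {low:.2f} to {high:.2f}")
```

With four hits out of four, the interval runs from about 0.51 to 1.00. That is too wide to justify a 30% penalty, or a strong bonus for that matter.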

The Mixer Operation Flow

python hosei_mixer_v1.py
→ [3] venue_correction Design Mode
  → [2] Manual edit: venue × jockey rank
    → Kyoto  / Rank-B: 0.700 → 0.900
    → Chukyo / Rank-B: 0.702 → 0.900
    → Fukushima / Rank-B: 0.700 → 0.900
    → Tokyo  / Rank-A: 0.700 → 0.900
  → [4] Save to venue_correction.csv
✅ Saved: 40 entries (8 venues × 5 attributes)

Four changes staged without saving, then committed in one shot.
Not a single line of source code was touched. That's the Mixer's reason for existing.
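
The stage-then-save idea can be sketched in a few lines. This is an illustration under assumptions, not the Mixer's actual code: the column names `venue`, `rank`, and `coef` and the three helper functions are hypothetical.

```python
import csv

def load_table(path):
    """Read a correction CSV into {(venue, rank): coefficient}."""
    with open(path, newline="", encoding="utf-8") as f:
        return {(r["venue"], r["rank"]): float(r["coef"]) for r in csv.DictReader(f)}

def apply_edits(table, edits):
    """Return a new table with the staged edits applied; unknown keys are rejected."""
    updated = dict(table)
    for key, value in edits.items():
        if key not in updated:
            raise KeyError(f"no such entry: {key}")
        updated[key] = value
    return updated

def save_table(table, path):
    """Write the whole table back to disk in one pass."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["venue", "rank", "coef"])  # hypothetical header
        for (venue, rank), coef in sorted(table.items()):
            writer.writerow([venue, rank, f"{coef:.3f}"])

# The four edits from the session above, held in memory until the save step
staged = {("Kyoto", "B"): 0.900, ("Chukyo", "B"): 0.900,
          ("Fukushima", "B"): 0.900, ("Tokyo", "A"): 0.900}
```

A session would then be `save_table(apply_edits(load_table(path), staged), path)`. Because `apply_edits` returns a copy, nothing on disk changes until the explicit save.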

No.40: Horse Distance Rank Correction — "Lateral Expansion" of the Jockey Coefficient

The Core Design Philosophy

The jockey rank coefficient architecture works like this:

Race results → Auto-calculate deviation rate → Save to venue_correction.csv
                                              → Managed via Mixer
                                              → v58.py reads it and multiplies total

Apply this exact pipeline to horses. That's all.

# Jockey rank coefficient (existing)
total *= venue_rank_f   # venue × jockey rank → coefficient

# Horse distance rank coefficient (new)
total *= horse_dist_f   # venue × horse distance rank → coefficient
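
As a rough illustration of how the two factors might be looked up before the multiplication, here is a minimal sketch. The `coefficient` helper, the dictionary layout, and the base score of 50.0 are all assumptions made for the example.

```python
def coefficient(table, venue, rank, default=1.0):
    """Look up a (venue, rank) coefficient; a missing entry leaves the score unchanged."""
    return table.get((venue, rank), default)

venue_rank_table = {("Kyoto", "B"): 0.900}  # jockey rank, relaxed value from the Mixer
horse_dist_table = {("Kyoto", "S"): 1.000}  # horse distance rank, initial value

total = 50.0  # hypothetical base score
total *= coefficient(venue_rank_table, "Kyoto", "B")
total *= coefficient(horse_dist_table, "Kyoto", "S")
print(round(total, 3))
```

Defaulting to 1.0 means a venue or rank missing from the CSV cannot silently zero out a score.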

The Classification Algorithm: Gemini's Two-Axis Approach

Evaluating each horse separately at each distance means falling straight into the small-N trap.
Gemini proposed "extract only two axes from all races" instead, so every race a horse has run contributes to its classification.

Axis	Calculation	Meaning
Sp (Speed)	Peak speed across all races (distance ÷ time in seconds)	Sprint aptitude
St (Stamina)	Longest distance finished within 1.5 seconds of winner	Endurance aptitude

Four horse types are auto-classified from these two axes: S / M / I / E.
With 4–5 races per horse, classification is possible. The small-N problem is avoided.

def calc_horse_dist_type(df_db):
    """Auto-classify all horses by distance type from horse_database_v35.csv.

    Columns: 馬ID = horse ID, 距離_num = distance (m), 走破秒 = finishing time (s),
    着差秒 = margin behind the winner (s). SP_THRESH and ST_THRESH are assumed
    to be module-level constants.
    """
    result = {}
    for h_id, grp in df_db.groupby('馬ID'):
        # Sp: peak speed (m/s) across all races; rows without a valid time are skipped
        sp = max((dist / sec for dist, sec in zip(grp['距離_num'], grp['走破秒'])
                  if sec > 0), default=0.0)
        # St: longest distance finished within 1.5s of the winner (0 if none)
        st = max(grp.loc[grp['着差秒'] <= 1.5, '距離_num'], default=0)

        if sp >= SP_THRESH and st < ST_THRESH:
            dist_type = 'S'  # Sprinter
        elif sp >= SP_THRESH:
            dist_type = 'M'  # All-rounder (fast and stays the distance)
        elif st >= ST_THRESH:
            dist_type = 'I'  # Stayer
        else:
            dist_type = 'E'  # Other
        result[h_id] = dist_type
    return result
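
The same two-axis rule can be traced by hand on one horse without pandas. This is a sketch only: the `classify` helper, the three race records, and the thresholds (17.0 m/s and 1800 m) are hypothetical values chosen for illustration, not the ones used in V6.

```python
def classify(races, sp_thresh, st_thresh):
    """Classify one horse from (distance_m, time_sec, margin_sec) tuples."""
    # Sp: peak speed in m/s over every race with a valid time
    sp = max((d / t for d, t, _ in races if t > 0), default=0.0)
    # St: longest distance finished within 1.5 s of the winner
    st = max((d for d, _, m in races if m <= 1.5), default=0)
    if sp >= sp_thresh and st < st_thresh:
        return "S", sp, st
    if sp >= sp_thresh:
        return "M", sp, st
    if st >= st_thresh:
        return "I", sp, st
    return "E", sp, st

# Hypothetical horse: quick over 1200 m, well beaten over 1800 m
races = [(1200, 69.6, 0.3), (1400, 83.0, 0.9), (1800, 112.5, 2.4)]
dist_type, sp, st = classify(races, sp_thresh=17.0, st_thresh=1800)
print(dist_type, round(sp, 2), st)
```

Here the peak speed comes from the 1200 m run (about 17.24 m/s), and the 1800 m race does not count toward St because the margin was 2.4 seconds. Under these thresholds the horse is classified as S.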

New File: horse_dist_correction.csv

Identical structure to venue_correction.csv.
10 venues × 4 ranks (S/M/I/E) = 40 rows.
All coefficients start at 1.000 — zero impact on scores.

Coefficient 1.000 means no change to predictions. Safe to deploy immediately.
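
A neutral starting file of this shape could be generated along the following lines. This is a sketch under assumptions: the column names are hypothetical, and the romanized names of the ten JRA racecourses are assumed as keys (the real file may use different labels).

```python
import csv
import io

# The ten JRA racecourses; the romanized spellings used as keys are an assumption
VENUES = ["Sapporo", "Hakodate", "Fukushima", "Niigata", "Tokyo",
          "Nakayama", "Chukyo", "Kyoto", "Hanshin", "Kokura"]
DIST_TYPES = ["S", "M", "I", "E"]

def initial_rows(venues=VENUES, dist_types=DIST_TYPES, coef=1.0):
    """Build one neutral row per venue x distance type."""
    return [(v, t, coef) for v in venues for t in dist_types]

def to_csv_text(rows):
    """Serialize rows to CSV text (hypothetical column names)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["venue", "dist_type", "coef"])
    for venue, dist_type, coef in rows:
        writer.writerow([venue, dist_type, f"{coef:.3f}"])
    return buf.getvalue()

rows = initial_rows()
print(len(rows))  # 10 venues x 4 types
```

Since every score is multiplied by 1.000, shipping the file changes no prediction until a coefficient is actually edited.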

Added [5]🐴 Horse Distance Rank Mode to hosei_mixer_v1.py

Feature	venue_correction	horse_dist_correction
Auto-scan	`auto_scan_venue()`	`auto_scan_horse_dist()`
Proposal calc	`calc_venue_proposals()`	`calc_horse_dist_proposals()`
Menu item	`[3] Auto-scan → Suggest improvements`	`[3] Auto-scan → Suggest improvements`
Lock threshold	N≥10	N≥10

Exact same structure. V7 and beyond will follow the same pattern.

Two BUGfixes: Building System Reliability

No.41: Last-place Popularity Showing as 9999

Symptom: In Mode 3 (post-race), 14 out of ~450 races showed the last-place horse's popularity as 9999.

Root cause:

result.html has a different column layout from shutuba.html.
An extra "finishing position" column is inserted at the front,
shifting all td indices — td[10] no longer points to the popularity column.
→ The last-ranked horse fails the isdigit() check → not registered in _ninki_map → 9999

DEBUG logging revealed that all 14 cases involved scratched horses —
which then surfaced as a second bug: scratched horses appearing in the final prediction ranking.
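
How a non-numeric cell turns into 9999 can be reproduced in miniature. This is a simplified sketch, not the scraper itself: `build_ninki_map`, the row tuples, and the horse IDs are hypothetical, and only the `_ninki_map` idea and the 9999 sentinel come from the system described here.

```python
UNKNOWN_NINKI = 9999  # sentinel shown when no popularity was registered

def build_ninki_map(rows):
    """Map horse id -> popularity, skipping cells that are not plain digits."""
    ninki_map = {}
    for horse_id, cell_text in rows:
        text = cell_text.strip()
        if text.isdigit():
            ninki_map[horse_id] = int(text)
    return ninki_map

# Hypothetical rows: the third horse was scratched, so its cell holds no number
rows = [("h001", "1"), ("h002", " 7 "), ("h003", "")]
ninki_map = build_ninki_map(rows)
for horse_id, _ in rows:
    print(horse_id, ninki_map.get(horse_id, UNKNOWN_NINKI))
```

The lookup default hides the parsing failure, which is why the symptom only became visible as an odd popularity value in the output.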

Fix:

# Before (td index dependent)
_nk = int(_tds[10].get_text()) if len(_tds) > 10 else 0

# After (span tag takes priority, index is fallback)
_ntag = _row.find('span', class_='OddsPeople') or _row.find('span', class_='Popular')
if _ntag and _ntag.get_text(strip=True).isdigit():
    _nk = int(_ntag.get_text(strip=True))

No.42: Scratched Horses Appearing in the Prediction Ranking

Root cause: _excluded_ids was a local variable inside a try block, never passed downstream to STEP2 and beyond.

Fix:

# End of try block: save excluded_ids into extracted
extracted['excluded_ids'] = _excluded_ids

# Just before STEP2: filter entries, h_ids, j_ids, t_ids simultaneously
_ex_ids = extracted.get('excluded_ids', set())
if _ex_ids:
    extracted['entries'] = [e for e in extracted['entries']
                            if e['id'] not in _ex_ids]
    # h_ids / j_ids / t_ids filtered in the same pass
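
One way the "same pass" filtering could look is sketched below. It assumes, hypothetically, that `h_ids`, `j_ids`, and `t_ids` are lists aligned position by position with `entries`; the real data layout may differ, and `drop_excluded` is an illustrative name.

```python
def drop_excluded(extracted):
    """Remove scratched horses from entries and from the id lists aligned with them."""
    excluded = extracted.get("excluded_ids", set())
    if not excluded:
        return extracted
    # Decide once which positions survive, then reuse that for every list
    keep = [i for i, e in enumerate(extracted["entries"]) if e["id"] not in excluded]
    cleaned = dict(extracted)
    for key in ("entries", "h_ids", "j_ids", "t_ids"):
        cleaned[key] = [extracted[key][i] for i in keep]
    return cleaned

extracted = {
    "entries": [{"id": "h1"}, {"id": "h2"}, {"id": "h3"}],
    "h_ids": ["h1", "h2", "h3"],
    "j_ids": ["j1", "j2", "j3"],
    "t_ids": ["t1", "t2", "t3"],
    "excluded_ids": {"h2"},
}
cleaned = drop_excluded(extracted)
print(cleaned["h_ids"], cleaned["j_ids"])
```

Computing the surviving positions once keeps the horse, jockey, and trainer lists in step, so a scratched horse cannot leave its jockey behind in the ranking.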

A scratch summary is now printed at the end of every run:

──────────────────────────────────
⚠️  Scratched Horses (excluded from predictions)
──────────────────────────────────
  #1  Ange Bleu   (Scratched → no popularity data · excluded)
──────────────────────────────────

No.43: Switch to Terminal Capture Logging

The xlsx comment log was diverging from the actual terminal output, so the logging was fixed at the root instead of being patched line by line.

Before: _console_log hand-built over ~200 lines (mismatches, omissions)
After: TeeLogger wraps sys.stdout and captures output in real time

import sys

class TeeLogger:
    """Wraps sys.stdout and accumulates all output into a list."""
    def __init__(self):
        self._orig = sys.stdout
        self._lines = []
    def write(self, s):
        self._orig.write(s)
        # Whitespace-only chunks (such as the newline print() emits) are not stored
        if s.strip():
            self._lines.append(s.rstrip())
        return len(s)
    def flush(self):
        self._orig.flush()
    def get_lines(self):
        return self._lines

# At the top of main()
_tee = TeeLogger()
sys.stdout = _tee

# At the end of main() (after scratch summary output)
sys.stdout = _tee._orig
console_log = [(line, 'normal') for line in _tee.get_lines()]

Result: Every line printed to the terminal is recorded verbatim in the xlsx comment sheet.
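
One possible hardening, shown here as a sketch and not as what the scripts currently do, is to wrap the swap in a context manager so that `sys.stdout` is restored even if `main()` raises. The name `tee_stdout` is hypothetical.

```python
import contextlib
import sys

@contextlib.contextmanager
def tee_stdout(lines):
    """Copy everything printed inside the block into `lines`, then restore stdout."""
    original = sys.stdout

    class _Tee:
        def write(self, s):
            original.write(s)
            if s.strip():
                lines.append(s.rstrip())
            return len(s)

        def flush(self):
            original.flush()

    sys.stdout = _Tee()
    try:
        yield lines
    finally:
        sys.stdout = original  # restored even if the body raises

captured = []
with tee_stdout(captured):
    print("Race 11: prediction ready")
console_log = [(line, "normal") for line in captured]
```

The capture logic is the same as in TeeLogger; the only difference is that the restore step lives in a `finally` clause.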

No.44: CSV → SQLite Dual-Operation DB Sync

Why "Dual-Operation"?

horse_database_accumulated_v35.csv was approaching 30,000 rows.

"The lowest-risk approach is to keep the CSV exactly as-is, and run a SQLite DB in parallel."

Dual-operation layout:

seiki_v5.py    ─┐
reisugo_v58.py  ─┼──→ horse_database_accumulated_v35.csv  ← unchanged
hiseiki_v10.py  ─┘             ↕ sync
                       horse_database.db  ← SQLite (new)

The Implementation Is Minimal

One new shared module horse_db_sync.py in 00_tools/, then two lines added to each of the three scripts.

# Inside step3_fetch_horse_data() in each script
df_merged.to_csv(HORSE_ACCUM_CSV, ...)   # existing (untouched)
sync_to_db(df_merged)                    # ← this one line added

If DB sync fails, the message "⚠️ DB sync skipped (CSV operation continues)" is displayed and nothing breaks.
The CSV rollback path stays alive at all times.
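
The fail-soft behaviour might look something like the sketch below. It is not the contents of horse_db_sync.py: the real `sync_to_db` receives a DataFrame, whereas this version takes plain tuples to stay dependency-free, and the table name `horse_results` and its three columns are hypothetical.

```python
import sqlite3

def sync_to_db(rows, db_path="horse_database.db"):
    """Mirror (horse_id, race_date, payload) rows into SQLite; never raise to the caller."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS horse_results ("
                    "horse_id TEXT, race_date TEXT, payload TEXT, "
                    "PRIMARY KEY (horse_id, race_date))"
                )
                conn.executemany(
                    "INSERT OR REPLACE INTO horse_results VALUES (?, ?, ?)", rows
                )
        finally:
            conn.close()
        return True
    except sqlite3.Error as exc:
        # The CSV has already been written, so a DB failure is only reported
        print(f"⚠️ DB sync skipped (CSV operation continues): {exc}")
        return False
```

Catching `sqlite3.Error` at the outermost level is what keeps the CSV path independent: the caller sees a boolean at most, never an exception. `INSERT OR REPLACE` on a primary key makes repeated syncs of the same rows harmless.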

Verification result:

horse_database.db: 29,051 records / latest data: 2026/06/03 ✅

868-Race Live Evaluation — Establishing V6's True Capability

Why 868 Races?

V5's validation ran on just 46 races — not enough for reliable statistics.
V6 used all 868 races from 2/28 to 5/31.

All 868 xlsx files were generated after the June 2nd coefficient update, meaning every race reflects the full V6-equivalent system: dark matter coefficient + jockey rank coefficient (corrected) + horse distance rank coefficient (initial value 1.000).

Three Verification Points

① V5py vs New V58py (868 races)

Metric	V5py (no correction)	New V58py (all coeff.)	Diff
1st place hit rate	24.6%	24.6%	0.0pt
Top-3 hit rate	54.1%	53.7%	−0.4pt

Virtually identical. The coefficients are currently influencing rank-2-and-below ordering more than the top pick itself. As N accumulates toward the lock threshold, V58py is expected to pull ahead.
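
For reference, the two metrics and the point difference can be computed as in this sketch. It assumes that "top-3 hit rate" means the system's top pick finished in the top three, which is an interpretation and not stated explicitly; the helper names and the eight-race sample are hypothetical.

```python
def hit_rates(finish_positions):
    """finish_positions: actual finishing position of each race's top pick."""
    n = len(finish_positions)
    if n == 0:
        raise ValueError("no races")
    win = sum(1 for p in finish_positions if p == 1) / n
    top3 = sum(1 for p in finish_positions if p <= 3) / n
    return win, top3

def diff_points(rate_a, rate_b):
    """Difference between two rates, in percentage points."""
    return round((rate_b - rate_a) * 100, 1)

sample = [1, 4, 2, 3, 9, 1, 5, 2]  # hypothetical eight races
win, top3 = hit_rates(sample)
print(f"1st: {win:.1%}  top-3: {top3:.1%}")
print(diff_points(0.541, 0.537))
```

Applied to the top-3 row of the table, `diff_points(0.541, 0.537)` gives -0.4, matching the Diff column.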

② Old V58py vs New V58py — The Mixer's Proof of Value

Metric	Old V58py (403R, pre-fix)	New V58py (868R, post-fix)	Change
1st place hit rate	20.6%	24.6%	+4.0pt
Top-3 hit rate	46.1%	53.7%	+7.6pt

Top-3 hit rate rose by 7.6 points (46.1% → 53.7%). That's the Mixer's work, measured.
Relaxing the 0.700 over-penalty to 0.900 lines up with gains in both metrics, with the caveat that the two runs cover different race sets (403 vs. 868 races).

③ Races 1–6 vs Races 7–12 — The Brag Point

Group	N	V5py top-3	New V58py top-3
Races 1–6 (maiden/newcomer)	406R	66.3%	65.3%
Races 7–12 (experienced horses)	443R	42.9%	43.1%

Top-3 accuracy of 65 to 66% for maiden and newcomer races, where horses have minimal past performance data.
That's roughly 22 to 23 points higher than for the experienced horses in races 7–12.
The dark matter coefficient + jockey rank coefficient deliver reliable predictions even without much race history. Proven across 406 races.

Venue-by-Venue Results

Venue	V5py	New V58py	Diff	Assessment
Hanshin	63.4%	64.5%	+1.1pt	V58 ahead ✅
Tokyo	50.0%	53.6%	+3.6pt	V58 ahead ✅
Nakayama	58.9%	57.1%	−1.8pt	Roughly equal
Fukushima	52.8%	50.0%	−2.8pt	Roughly equal
Kyoto	51.1%	47.4%	−3.7pt	Needs tracking ⚠️
Niigata	40.5%	36.9%	−3.6pt	Top priority issue ⚠️
Chukyo	49.2%	35.6%	−13.6pt	Needs tuning ❌
Kokura	57.1%	42.9%	−14.2pt	Low N, needs review ❌

Chukyo, Kokura, and Niigata all have provisional coefficients due to insufficient N.
Priority: accumulate data first.

V6 → V9 Roadmap: Just Keep Expanding Laterally

V6 locked in the architecture. V7 onward only changes the classification logic — the pipeline stays the same.

Version	Axis	New CSV	Status
V6 (this one)	Distance aptitude (S/M/I/E)	`horse_dist_correction.csv`	✅ Done
V7 (next)	Finishing kick type (burst/sustained)	`horse_agari_correction.csv`	In preparation
V8	Course repeater	`horse_track_correction.csv`	Design phase
V9	Momentum / trend	`horse_trend_correction.csv`	Design phase

The Mixer's [3] Auto-scan menu works for every version unchanged. That's the power of lateral expansion.
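
The lateral-expansion idea can be expressed as a small registry in which each new version only adds one entry. This is a sketch under assumptions: the `CORRECTIONS` layout, the axis names, and `apply_corrections` are hypothetical, and in the real system each table would be loaded from its own CSV.

```python
from math import prod

# Axis name -> {(venue, rank): coefficient}; hypothetical in-memory stand-ins
# for venue_correction.csv and horse_dist_correction.csv
CORRECTIONS = {
    "jockey_rank": {("Tokyo", "A"): 0.900},
    "horse_dist": {("Tokyo", "S"): 1.000},
}

def apply_corrections(total, venue, ranks, corrections=CORRECTIONS):
    """Multiply the score by every registered axis; unknown keys count as 1.0."""
    factors = [
        table.get((venue, ranks.get(axis)), 1.0)
        for axis, table in corrections.items()
    ]
    return total * prod(factors)

score = apply_corrections(60.0, "Tokyo", {"jockey_rank": "A", "horse_dist": "S"})
print(round(score, 3))
```

Under this layout, adding the V7 axis would amount to registering one more table; the scoring loop itself would not change.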

This Week's Takeaway

"Whether the result is good or bad, it becomes material for the next step."

The 868-race evaluation produced some harsh numbers: Chukyo −13.6pt, Kokura −14.2pt.
Rather than reading these as failures, we treat them as tuning targets.

The value lies in having confirmed numbers at all.
The numbers that couldn't be trusted at 46 races are now confirmed at 868.
That alone is progress.

What the Mixer Is Becoming

From "a tool for manually adjusting coefficients" to "a tool for verifying and fine-tuning auto-calculated coefficients."

By V9, only three manual correction modes will remain: same-course/distance bonus, individual horse aptitude, and trainer correction.
Everything else will be automated through the Mixer's pipeline.

Summary

No.40: Horse distance rank correction implemented with the same architecture as the jockey rank coefficients. Gemini's two-axis algorithm avoids the small-N trap. The initial value of 1.000 means a safe, zero-impact deployment.
No.41 & 42: result.html popularity BUGfix and complete removal of scratched horses. System reliability improved.
No.43: TeeLogger keeps the terminal output and the xlsx comment sheet in step.
No.44: CSV→SQLite dual operation establishes the DB sync foundation. The CSV rollback path is always available.
868-race evaluation: +7.6pt top-3 improvement over old V58py. Roughly 65 to 66% top-3 hit rate for maiden/newcomer races. V6-equivalent system capability confirmed.

Next up: V7 (finishing kick coefficient) and data accumulation for Chukyo and Niigata.

Author: yuji (@yujiiwadate0247) — Yosoya yuji Horse Racing Prediction System · V6 Period · June 2026
