I Changed a Coefficient from 0.700 to 0.900 and Hit Rate Jumped 7.6 Points
— How Releasing an Over-Penalizing Coefficient Unlocked V6 of My Horse Racing AI —
Yosoya yuji — 71 years old, retired, Ubuntu, self-taught Python.
Building a horse racing prediction AI for the past six months.
Previous article: I Implemented a "Dark Matter Coefficient" and Only Kyoto Broke
Introduction
By the end of V5 (May 24–26), the "Mixer" — an interactive coefficient management tool — was fully operational.
The cycle of auto-calculation → CSV storage → Mixer management was running smoothly.
That's when a question surfaced.
"Can we apply the jockey rank coefficient to horses too? Rank them by distance aptitude and factor that into the score."
That single line became the starting point of V6.
What We Built in V6 (Overview)
| No. | Date | Change | Type |
|---|---|---|---|
| No.40 | 06/02 | Horse distance rank correction (the V6 core) | New feature |
| No.41 | 06/04 | result.html popularity fetch BUGfix | BUGfix |
| No.42 | 06/04 | Complete removal of scratched horses | BUGfix |
| No.43 | 06/04 | Switch comment log to terminal capture method | Improvement |
| No.44 | 06/05 | Dual-operation DB sync (horse_db_sync.py) | New feature |
Then on June 6th, we ran a full 868-race live evaluation.
The Mixer Earned Its Keep: Emergency Relaxation of 0.700 → 0.900
Why Was It Urgent?
During V5's 46-race validation, a troubling pattern appeared.
Horses ridden by Rank-B jockeys: N=4, top-3 hit rate = 100% (no correction applied)
Yet V58py was slapping those same horses with a −30% penalty. Coefficient: 0.700.
This is a textbook case of a small-sample coefficient backfiring.
With N=4 to 8, the data simply isn't stable enough to trust the computed penalty.
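To put a number on how little N=4 pins down, here is a minimal sketch that computes a 95% Wilson score interval for a hit rate. The helper name `wilson_interval` is illustrative and not part of the prediction system; the formula is the standard Wilson interval.

```python
import math

def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for hits/n (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Rank-B jockeys in the 46-race validation: 4 top-3 finishes out of 4 rides
low, high = wilson_interval(4, 4)
print(f"N=4, 4 hits: plausible hit rate from {low:.2f} to {high:.2f}")
```

With four samples the lower bound comes out near 0.51, so the interval covers about half of the possible range. A sample that small cannot justify a 30% penalty, and it would not justify a bonus either.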
The Mixer Operation Flow
python hosei_mixer_v1.py
→ [3] venue_correction Design Mode
→ [2] Manual edit: venue × jockey rank
→ Kyoto / Rank-B: 0.700 → 0.900
→ Chukyo / Rank-B: 0.702 → 0.900
→ Fukushima / Rank-B: 0.700 → 0.900
→ Tokyo / Rank-A: 0.700 → 0.900
→ [4] Save to venue_correction.csv
✅ Saved: 40 entries (8 venues × 5 attributes)
Four changes staged without saving, then committed in one shot.
Not a single line of source code was touched. That's the Mixer's reason for existing.
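The stage-then-commit idea can be sketched as follows. This is only an illustration: the column names `venue`, `attribute`, and `coef` are assumptions, since the real layout of venue_correction.csv is not shown here, and `apply_staged_edits` is a hypothetical helper rather than a function from hosei_mixer_v1.py.

```python
import csv
import io

def apply_staged_edits(csv_text: str, staged: dict[tuple[str, str], float]) -> str:
    """Return new CSV text with every staged (venue, attribute) edit applied at once."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    pending = dict(staged)
    for row in rows:
        key = (row["venue"], row["attribute"])
        if key in pending:
            row["coef"] = f"{pending.pop(key):.3f}"
    if pending:
        # Refuse to save if an edit names a row that does not exist
        raise KeyError(f"unknown rows: {sorted(pending)}")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical three-row file and two staged edits
current = "venue,attribute,coef\nKyoto,Rank-B,0.700\nChukyo,Rank-B,0.702\nTokyo,Rank-A,0.700\n"
staged = {("Kyoto", "Rank-B"): 0.9, ("Chukyo", "Rank-B"): 0.9}
print(apply_staged_edits(current, staged))
```

Because the edits are collected first and written in a single pass, a typo in one entry aborts the save before any row is changed.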
No.40: Horse Distance Rank Correction — "Lateral Expansion" of the Jockey Coefficient
The Core Design Philosophy
The jockey rank coefficient architecture works like this:
Race results → Auto-calculate deviation rate → Save to venue_correction.csv
→ Managed via Mixer
→ v58.py reads it and multiplies total
Apply this exact pipeline to horses. That's all.
# Jockey rank coefficient (existing)
total *= venue_rank_f # venue × jockey rank → coefficient
# Horse distance rank coefficient (new)
total *= horse_dist_f # venue × horse distance rank → coefficient
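A minimal sketch of how the two factors might be looked up before the multiplication. The dictionaries, the `lookup_coef` helper, and the choice to fall back to 1.0 for a missing key are all assumptions made for illustration; the actual loading code in v58.py is not shown in this article.

```python
def lookup_coef(table: dict[tuple[str, str], float], venue: str, rank: str) -> float:
    """Return the coefficient for (venue, rank), or the neutral 1.0 if none is stored."""
    return table.get((venue, rank), 1.0)

# Hypothetical in-memory copies of the two correction CSVs
venue_rank_table = {("Kyoto", "Rank-B"): 0.900}
horse_dist_table = {("Kyoto", "S"): 1.000}

total = 80.0  # hypothetical base score
total *= lookup_coef(venue_rank_table, "Kyoto", "Rank-B")  # jockey rank factor
total *= lookup_coef(horse_dist_table, "Kyoto", "S")       # horse distance rank factor
print(round(total, 3))
```

Defaulting to 1.0 keeps an unlisted venue or rank from changing the score, which matches the idea that a new coefficient axis starts out neutral.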
The Classification Algorithm: Gemini's Two-Axis Approach
Analyzing each horse separately at each distance means falling straight into the small-N trap.
Gemini proposed "extract only two axes from all races" instead.
| Axis | Calculation | Meaning |
|---|---|---|
| Sp (Speed) | Peak speed across all races (distance ÷ time in seconds) | Sprint aptitude |
| St (Stamina) | Longest distance finished within 1.5 seconds of winner | Endurance aptitude |
Four horse types are auto-classified from these two axes: S / M / I / E.
With 4–5 races per horse, classification is possible. The small-N problem is avoided.
def calc_horse_dist_type(df_db):
    """Auto-classify all horses by distance type from horse_database_v35.csv"""
    result = {}
    for h_id, grp in df_db.groupby('馬ID'):
        # Sp: peak speed (m/s) across all races
        sp = max(dist / sec for dist, sec in zip(grp['距離_num'], grp['走破秒']))
        # St: longest distance finished within 1.5s of the winner
        st = max((grp.loc[grp['着差秒'] <= 1.5, '距離_num']), default=0)
        if sp >= SP_THRESH and st < ST_THRESH:
            dist_type = 'S'  # Sprinter
        elif sp >= SP_THRESH and st >= ST_THRESH:
            dist_type = 'M'  # All-rounder
        elif st >= ST_THRESH:
            dist_type = 'I'  # Stayer
        else:
            dist_type = 'E'  # Other
        result[h_id] = dist_type
    return result
New File: horse_dist_correction.csv
Identical structure to venue_correction.csv.
10 venues × 4 ranks (S/M/I/E) = 40 rows.
All coefficients start at 1.000 — zero impact on scores.
Coefficient 1.000 means no change to predictions. Safe to deploy immediately.
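Generating the neutral starting file could look roughly like this. The ten names are the JRA racecourses; the column names `venue`, `dist_type`, and `coef` are assumptions, since the article only says the structure mirrors venue_correction.csv.

```python
import csv
import io

JRA_VENUES = ["Sapporo", "Hakodate", "Fukushima", "Niigata", "Tokyo",
              "Nakayama", "Chukyo", "Kyoto", "Hanshin", "Kokura"]
DIST_TYPES = ["S", "M", "I", "E"]

def build_initial_rows() -> list[dict[str, str]]:
    """One row per venue x distance type, every coefficient neutral (1.000)."""
    return [{"venue": v, "dist_type": t, "coef": "1.000"}
            for v in JRA_VENUES for t in DIST_TYPES]

rows = build_initial_rows()
buf = io.StringIO()  # stands in for the real file handle
writer = csv.DictWriter(buf, fieldnames=["venue", "dist_type", "coef"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(len(rows), "rows")  # 10 venues x 4 types = 40
```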
Added [5]🐴 Horse Distance Rank Mode to hosei_mixer_v1.py
| Feature | venue_correction | horse_dist_correction |
|---|---|---|
| Auto-scan | auto_scan_venue() | auto_scan_horse_dist() |
| Proposal calc | calc_venue_proposals() | calc_horse_dist_proposals() |
| Menu item | [3] Auto-scan → Suggest improvements | [3] Auto-scan → Suggest improvements |
| Lock threshold | N≥10 | N≥10 |
Exact same structure. V7 and beyond will follow the same pattern.
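The lock threshold in the table above can be illustrated with a small sketch. Only the N≥10 threshold comes from this article; the deviation measure (group hit rate divided by a baseline rate), the 0.7 to 1.3 clamp, and the `Proposal` and `propose` names are hypothetical and are not the actual logic of calc_venue_proposals() or calc_horse_dist_proposals().

```python
from dataclasses import dataclass

LOCK_MIN_N = 10  # lock threshold from the table above

@dataclass
class Proposal:
    venue: str
    rank: str
    n: int
    coef: float
    locked: bool  # True once the sample reaches LOCK_MIN_N

def propose(venue: str, rank: str, hits: int, n: int, baseline: float) -> Proposal:
    """Suggest a coefficient from a group's hit rate relative to a baseline rate."""
    if n == 0 or baseline <= 0:
        return Proposal(venue, rank, n, 1.0, False)
    ratio = (hits / n) / baseline               # hypothetical deviation measure
    coef = round(min(max(ratio, 0.7), 1.3), 3)  # hypothetical clamp range
    return Proposal(venue, rank, n, coef, n >= LOCK_MIN_N)

small = propose("Kyoto", "S", hits=3, n=4, baseline=0.5)
enough = propose("Tokyo", "M", hits=6, n=12, baseline=0.5)
print(small)
print(enough)
```

Keeping the `locked` flag separate from the coefficient lets a provisional value be displayed for review without being treated as settled.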
Two BUGfixes: Building System Reliability
No.41: Last-place Popularity Showing as 9999
Symptom: In Mode 3 (post-race), 14 out of ~450 races showed the last-place horse's popularity as 9999.
Root cause:
result.html has a different column layout from shutuba.html.
An extra "finishing position" column is inserted at the front,
shifting all td indices — td[10] no longer points to the popularity column.
→ The last-ranked horse fails the isdigit() check → not registered in _ninki_map → 9999
DEBUG logging revealed that all 14 cases involved scratched horses —
which then surfaced as a second bug: scratched horses appearing in the final prediction ranking.
Fix:
# Before (td index dependent)
_nk = int(_tds[10].get_text()) if len(_tds) > 10 else 0
# After (span tag takes priority, index is fallback)
_ntag = _row.find('span', class_='OddsPeople') or _row.find('span', class_='Popular')
if _ntag and _ntag.get_text(strip=True).isdigit():
    _nk = int(_ntag.get_text(strip=True))
No.42: Scratched Horses Appearing in the Prediction Ranking
Root cause: _excluded_ids was a local variable inside a try block, never passed downstream to STEP2 and beyond.
Fix:
# End of try block: save excluded_ids into extracted
extracted['excluded_ids'] = _excluded_ids
# Just before STEP2: filter entries, h_ids, j_ids, t_ids simultaneously
_ex_ids = extracted.get('excluded_ids', set())
if _ex_ids:
    extracted['entries'] = [e for e in extracted['entries']
                            if e['id'] not in _ex_ids]
    # h_ids / j_ids / t_ids filtered in the same pass
A scratch summary is now printed at the end of every run:
──────────────────────────────────
⚠️ Scratched Horses (excluded from predictions)
──────────────────────────────────
#1 Ange Bleu (Scratched → no popularity data · excluded)
──────────────────────────────────
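The "filtered in the same pass" step of the No.42 fix can be sketched like this. It assumes that h_ids, j_ids, and t_ids are lists aligned index by index with entries, which the article implies but does not show; `drop_excluded` is a hypothetical helper name.

```python
def drop_excluded(extracted: dict, keys=("entries", "h_ids", "j_ids", "t_ids")) -> dict:
    """Remove scratched horses from every parallel list in one pass."""
    excluded = extracted.get("excluded_ids", set())
    if not excluded:
        return extracted
    # Decide once which positions survive, then apply that to every list
    keep = [i for i, e in enumerate(extracted["entries"]) if e["id"] not in excluded]
    for k in keys:
        extracted[k] = [extracted[k][i] for i in keep]
    return extracted

extracted = {
    "entries": [{"id": "h1"}, {"id": "h2"}, {"id": "h3"}],
    "h_ids": ["h1", "h2", "h3"],
    "j_ids": ["j1", "j2", "j3"],
    "t_ids": ["t1", "t2", "t3"],
    "excluded_ids": {"h2"},
}
drop_excluded(extracted)
print(extracted["j_ids"])  # ['j1', 'j3']
```

Computing the surviving positions once keeps the four lists aligned, so a jockey or trainer ID can never end up paired with the wrong horse.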
No.43: Switch to Terminal Capture Logging
The xlsx comment log was diverging from the actual terminal output, so the fix targets the root cause.
Before: _console_log hand-built over ~200 lines (mismatches, omissions)
After: TeeLogger wraps sys.stdout and captures output in real time
import sys

class TeeLogger:
    """Wraps sys.stdout and accumulates all output into a list."""
    def __init__(self):
        self._orig = sys.stdout
        self._lines = []

    def write(self, s):
        self._orig.write(s)
        if s.strip():  # skip the bare newlines that print() writes separately
            self._lines.append(s.rstrip())

    def flush(self):
        self._orig.flush()

    def get_lines(self):
        return self._lines

# At the top of main()
_tee = TeeLogger()
sys.stdout = _tee
# At the end of main() (after scratch summary output)
sys.stdout = _tee._orig
console_log = [(line, 'normal') for line in _tee.get_lines()]
Result: Every line printed to the terminal is recorded verbatim in the xlsx comment sheet.
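One possible variation, shown only as a sketch: if main() raises partway through, the manual restore at the end never runs and sys.stdout stays wrapped. The standard library's `contextlib.redirect_stdout` restores the stream automatically. The `run_with_capture` wrapper and the `demo` function are hypothetical, and this TeeLogger takes the stream as an argument, unlike the version above.

```python
import sys
from contextlib import redirect_stdout

class TeeLogger:
    """Same idea as above: forward writes to the real stream and keep a copy."""
    def __init__(self, stream):
        self._orig = stream
        self._lines = []

    def write(self, s):
        self._orig.write(s)
        if s.strip():
            self._lines.append(s.rstrip())
        return len(s)

    def flush(self):
        self._orig.flush()

    def get_lines(self):
        return self._lines

def run_with_capture(func):
    """Run func() with stdout tee'd; stdout is restored even if func raises."""
    tee = TeeLogger(sys.stdout)
    with redirect_stdout(tee):
        func()
    return tee.get_lines()

def demo():
    print("Race 1: prediction ready")
    print()  # blank lines are forwarded to the terminal but not stored
    print("Race 2: prediction ready")

lines = run_with_capture(demo)
```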
No.44: CSV → SQLite Dual-Operation DB Sync
Why "Dual-Operation"?
horse_database_accumulated_v35.csv was approaching 30,000 rows.
"The lowest-risk approach is to keep the CSV exactly as-is, and run a SQLite DB in parallel."
Dual-operation layout:
seiki_v5.py    ─┐
reisugo_v58.py ─┼──→ horse_database_accumulated_v35.csv  ← unchanged
hiseiki_v10.py ─┘                  ↕ sync
                     horse_database.db                   ← SQLite (new)
The Implementation Is Minimal
One new shared module, horse_db_sync.py, goes in 00_tools/; each of the three scripts then gets two added lines.
# Inside step3_fetch_horse_data() in each script
df_merged.to_csv(HORSE_ACCUM_CSV, ...) # existing (untouched)
sync_to_db(df_merged) # ← this one line added
If DB sync fails, the message "⚠️ DB sync skipped (CSV operation continues)" is displayed and nothing breaks.
The CSV rollback path stays alive at all times.
Verification result:
horse_database.db: 29,051 records / latest data: 2026/06/03 ✅
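The fail-soft behavior of the sync step might be sketched as below, using only the standard library. The real sync_to_db() receives a DataFrame; this version takes a list of dicts to stay self-contained, and the table name `horse_results` and its three columns are hypothetical.

```python
import sqlite3
from contextlib import closing

def sync_to_db(rows: list[dict], db_path: str = "horse_database.db") -> bool:
    """Mirror rows into SQLite; on any DB error, warn and let the CSV path carry on."""
    try:
        # closing() closes the connection; the inner `conn` commits or rolls back
        with closing(sqlite3.connect(db_path)) as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS horse_results ("
                "horse_id TEXT, race_date TEXT, distance INTEGER, "
                "PRIMARY KEY (horse_id, race_date))"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO horse_results "
                "VALUES (:horse_id, :race_date, :distance)",
                rows,
            )
        return True
    except sqlite3.Error as exc:
        print(f"⚠️ DB sync skipped (CSV operation continues): {exc}")
        return False

# Hypothetical single record, written to an in-memory DB for the demo
rows = [{"horse_id": "H001", "race_date": "2026-05-31", "distance": 1600}]
ok = sync_to_db(rows, ":memory:")
print(ok)
```

Returning a flag instead of raising means the caller's existing to_csv() line is never affected by a database problem.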
868-Race Live Evaluation — Establishing V6's True Capability
Why 868 Races?
V5's validation ran on just 46 races — not enough for reliable statistics.
V6 used all 868 races from 2/28 to 5/31.
All 868 xlsx files were generated after the June 2nd coefficient update, meaning every race reflects the full V6-equivalent system: dark matter coefficient + jockey rank coefficient (corrected) + horse distance rank coefficient (initial value 1.000).
Three Verification Points
① V5py vs New V58py (868 races)
| Metric | V5py (no correction) | New V58py (all coeff.) | Diff |
|---|---|---|---|
| 1st place hit rate | 24.6% | 24.6% | 0.0pt |
| Top-3 hit rate | 54.1% | 53.7% | −0.4pt |
Virtually identical. The coefficients are currently influencing rank-2-and-below ordering more than the top pick itself. As N accumulates toward the lock threshold, V58py is expected to pull ahead.
② Old V58py vs New V58py — The Mixer's Proof of Value
| Metric | Old V58py (403R, pre-fix) | New V58py (868R, post-fix) | Change |
|---|---|---|---|
| 1st place hit rate | 20.6% | 24.6% | +4.0pt |
| Top-3 hit rate | 46.1% | 53.7% | +7.6pt |
Top-3 hit rate improved by +7.6 points. That's the Mixer's work, measured.
After the 0.700 over-penalty was relaxed to 0.900, accuracy rose on both metrics, though the two runs cover different numbers of races (403 vs 868).
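As a rough check on whether a 7.6-point gap could be sampling noise, a two-proportion z statistic can be computed from the table figures. This is a sketch under a strong assumption: it treats the 403-race and 868-race sets as independent samples. If the two sets overlap, the result is only a loose guide.

```python
import math

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Top-3 hit rates from the table: old V58py (403R) vs new V58py (868R)
z = two_proportion_z(0.461, 403, 0.537, 868)
print(f"z = {z:.2f}")
```

The statistic comes out at roughly 2.5. Under the independence assumption, values above about 1.96 are conventionally read as unlikely to be chance at the 5% level.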
③ Races 1–6 vs Races 7–12 — The Brag Point
| Group | N | V5py top-3 | New V58py top-3 |
|---|---|---|---|
| Races 1–6 (maiden/newcomer) | 406R | 66.3% | 65.3% |
| Races 7–12 (experienced horses) | 443R | 42.9% | 43.1% |
Top-3 accuracy of 65 to 66% for maiden and newcomer races, where horses have minimal past performance data.
That's roughly 23 points higher than experienced horses in races 7–12.
The dark matter coefficient + jockey rank coefficient deliver reliable predictions even without much race history. Proven across 406 races.
Venue-by-Venue Results
| Venue | V5py top-3 | New V58py top-3 | Diff | Assessment |
|---|---|---|---|---|
| Hanshin | 63.4% | 64.5% | +1.1pt | V58 ahead ✅ |
| Tokyo | 50.0% | 53.6% | +3.6pt | V58 ahead ✅ |
| Nakayama | 58.9% | 57.1% | −1.8pt | Roughly equal |
| Fukushima | 52.8% | 50.0% | −2.8pt | Roughly equal |
| Kyoto | 51.1% | 47.4% | −3.7pt | Needs tracking ⚠️ |
| Niigata | 40.5% | 36.9% | −3.6pt | Top priority issue ⚠️ |
| Chukyo | 49.2% | 35.6% | −13.6pt | Needs tuning ❌ |
| Kokura | 57.1% | 42.9% | −14.2pt | Low N, needs review ❌ |
Chukyo, Kokura, and Niigata all have provisional coefficients due to insufficient N.
Priority: accumulate data first.
V6 → V9 Roadmap: Just Keep Expanding Laterally
V6 locked in the architecture. V7 onward only changes the classification logic — the pipeline stays the same.
| Version | Axis | New CSV | Status |
|---|---|---|---|
| V6 (this one) | Distance aptitude (S/M/I/E) | horse_dist_correction.csv | ✅ Done |
| V7 (next) | Finishing kick type (burst/sustained) | horse_agari_correction.csv | In preparation |
| V8 | Course repeater | horse_track_correction.csv | Design phase |
| V9 | Momentum / trend | horse_trend_correction.csv | Design phase |
The Mixer's [3] Auto-scan menu works for every version unchanged. That's the power of lateral expansion.
This Week's Takeaway
"Whether the result is good or bad, it becomes material for the next step."
The 868-race evaluation produced some harsh numbers: Chukyo −13.6pt, Kokura −14.2pt.
Rather than reading these as failures, we treat them as tuning targets.
The value lies in having confirmed numbers at all.
The numbers that couldn't be trusted at 46 races are now confirmed at 868.
That alone is progress.
What the Mixer Is Becoming
From "a tool for manually adjusting coefficients" to "a tool for verifying and fine-tuning auto-calculated coefficients."
By V9, only three manual correction modes will remain: same-course/distance bonus, individual horse aptitude, and trainer correction.
Everything else will be automated through the Mixer's pipeline.
Summary
- No.40 — Horse distance rank correction implemented with the same architecture as jockey rank coefficients. Gemini's two-axis algorithm avoids the small-N trap. Initial value 1.000 means safe zero-impact deployment.
- No.41 & 42 — result.html popularity BUGfix and complete removal of scratched horses. System reliability improved.
- No.43: TeeLogger makes the xlsx comment sheet match the terminal output line for line.
- No.44 — CSV→SQLite dual-operation establishes the DB sync foundation. CSV rollback path always available.
- 868-race evaluation: top-3 hit rate up 7.6pt over old V58py. 65 to 66% top-3 hit rate for maiden/newcomer races. V6-equivalent system capability confirmed.
Next up: V7 (finishing kick coefficient) and data accumulation for Chukyo and Niigata.
Author: yuji (@yujiiwadate0247) — Yosoya yuji Horse Racing Prediction System · V6 Period · June 2026