I Implemented a "Dark Matter Coefficient" and Only Kyoto Broke — Horse Racing AI V4: The Truth Behind the "All-Horse Uniform Correction Wall" and venue_correction.csv
=== The Complete Record of a New Correction Algorithm Born in 7 Days ===
Python / Algorithm / Data Analysis / Architecture / Horse Racing AI
Period: May 17, 2026 — May 24, 2026 (2-day verification complete · Final version)
Author: yuji
"Why do the results differ when both use the same algorithm?" — That question was left unanswered at the end of the last article.
If V3 (Apr 1 – May 18) proved the design philosophy that "deletion is harder and more valuable than addition," then V4 faced an even more fundamental question: the discovery that a correction we believed was working had never been working at all.
In 7 days, we went from concept to discovery, design, implementation, and verification. One complete cycle.
V4 Period Summary
| Metric | Value |
|---|---|
| V58py changes (V4 period) | No.36–39 (4 changes) |
| Most important new CSV | venue_correction.csv (venue × jockey rank correction) |
| V3 confirmed baseline (403 races) | 3rd-place-or-better rate: 42.7% |
| 2-day verification result | 69 races total (May 23–24) completed |
| Biggest V4 discovery | Design vs. Implementation Gap ("all-horse uniform multiplication never changes rankings") |
| Biggest V4 concept | "Venue Dark Matter Coefficient" (the origin of the venue_dm variable name) |
| Final verdict | V5py migration on hold (Kyoto venue_rank_f over-correction confirmed · re-verify after coefficient adjustment) |
1. "Same Algorithm, Different Results" — The Origin of V4
On May 17, during a deep dive into the 403-race confirmed dataset comparing V5py and V58py, this question emerged:
"V5py and V58py share the same core algorithm, yet their results differ. Isn't that worth chasing? Just pinning down the cause of that difference could push V5py's hit rate up several points."
During V3, we concluded that "rotation correction was noise" and deleted it. With the cleaned 403-race dataset in hand, the next question arose: "If they're twins sharing the same algorithm, why do they diverge?"
That question became the starting point of V4.
2. The "Dark Matter" Concept
While studying venue-by-venue data, a striking observation surfaced:
Nakayama 54.7% vs. Kyoto 38.3% — a 16-point gap.
That gap can't be explained by visible factors like horse ability, jockey skill, or track suitability alone.
"There's an invisible force in the universe called dark matter — it makes up 85% of all matter. If we think of a racetrack as a universe, maybe the same idea applies. Perhaps we're trying to capture those inexplicable 10–16 point differences as a single coefficient and track it down."
In statistical terms: unexplained variance. Something that exists as a residual beyond the reach of existing explanatory variables.
This became the direct origin of the variable name venue_dm (venue dark matter).
But a wall appeared immediately.
If the dark matter coefficient is venue-specific, it should affect all horses equally. But multiplying every horse in a race by the same positive coefficient never changes the ranking. This had already been mathematically proven during V3.
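The uniform-multiplication claim can be checked in a few lines. This is a minimal sketch, not the article's code: the horse names, index values, and the 0.85 coefficient are made up for illustration, and `rank_order` is a hypothetical helper.

```python
def rank_order(scores):
    """Return horse names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical pre-correction indices for one race (illustrative values only).
base = {"horse_a": 16.4, "horse_b": 14.1, "horse_c": 12.9}

# A venue-wide coefficient multiplies every horse by the same positive number.
venue_dm = 0.85
uniform = {name: score * venue_dm for name, score in base.items()}

# The order is identical because a > b implies c*a > c*b for any c > 0.
same_order = rank_order(base) == rank_order(uniform)
print(same_order)
```

The scores themselves shrink, but the comparison between any two horses is untouched, so any hit rate computed from the ranking is also untouched.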
"Multiplying all horses by a uniform coefficient doesn't change rankings or hit rates. The most suspicious thing is the calculation itself."
"Dark matter influences each horse differently" — this conclusion led to the design of venue_rank_f (venue × jockey rank individual coefficient).
3. "Design vs. Implementation Gap" — The Late-Night Cross-Check
At 3 AM on May 19, this instruction was issued:
"There are 16 correction values being handled by the mixer. Check whether any of the other formulas using them have the same kind of mistake. Review it at least 3 times. Normally, a separate person should be doing a double-check on work like this —"
After 3 rounds of review, a shocking fact emerged:
| Correction | Structure | Effect on Rankings | Problem |
|---|---|---|---|
| TRACK_FACTOR (10 venues) | All horses uniform | None | Same issue as basho_factor |
| CLASS_FACTOR (G1/G2/OP) | All horses uniform | None | Same race = same class |
| baba_f (track condition) | All horses uniform | None | Same race = same track |
| basho_factor | All horses uniform | None | Discovered this time |
| tekisei_f | Per horse individual | Yes | No problem |
| trainer_b | Per horse individual | Yes | No problem |
The CSV was being read. The calculation was in the formula. The log showed it running. Everything appeared to work correctly — yet none of it had any effect on the hit rate.
"If I hadn't pushed all the way to the bottom of this, it would have never been found."
The direction of the idea was correct: something venue-specific affects all horses, so encode it as a coefficient. But one fundamental mathematical fact was overlooked: if you multiply every horse in the same race by the same positive coefficient, the rankings cannot change.
Concept ✓ · Implementation ✗.
This was a direct application of a semiconductor manufacturing quality-control habit: "detect anomalies from yield fluctuation." The discovery was triggered by noticing a numerical anomaly: uniform multiplication was changing nothing.
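The manual cross-check described above could in principle be automated. The sketch below is an assumption about how such an audit might look, not the procedure actually used: `find_uniform_factors`, the `race_id` key, and the sample rows are all hypothetical. It flags any factor whose value never varies between horses in the same race.

```python
from collections import defaultdict

def find_uniform_factors(rows, factor_names, tol=1e-12):
    """Return the factor names whose value never varies within any race."""
    by_race = defaultdict(list)
    for row in rows:
        by_race[row["race_id"]].append(row)

    uniform = []
    for name in factor_names:
        # A factor can move rankings only if it differs inside at least one race.
        varies = any(
            max(r[name] for r in horses) - min(r[name] for r in horses) > tol
            for horses in by_race.values()
        )
        if not varies:
            uniform.append(name)
    return uniform

# Hypothetical per-horse rows: baba_f is constant per race, tekisei_f is not.
rows = [
    {"race_id": "R1", "baba_f": 0.98, "tekisei_f": 1.05},
    {"race_id": "R1", "baba_f": 0.98, "tekisei_f": 0.97},
    {"race_id": "R2", "baba_f": 1.02, "tekisei_f": 1.00},
    {"race_id": "R2", "baba_f": 1.02, "tekisei_f": 1.03},
]
flagged = find_uniform_factors(rows, ["baba_f", "tekisei_f"])
print(flagged)
```

Note that `baba_f` differs between races yet is still flagged, because only variation inside a single race can reorder horses.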
4. The Solution: Birth of venue_correction.csv
On May 22, a design that broke through the "all-uniform wall" was implemented (V58py No.38).
# Structure of venue_correction.csv
# {(venue_name, attribute): factor}
# venue,attribute,factor
# Kyoto,dm,1.000 # venue_dm : same for all horses · no ranking effect
# Kyoto,rank_S,1.050 # venue_rank_f: only horses with jockey rank S
# Kyoto,rank_A,1.020 # venue_rank_f: only horses with jockey rank A
# Kyoto,rank_B,1.000 # venue_rank_f: only horses with jockey rank B
# Kyoto,rank_C,0.960 # venue_rank_f: only horses with jockey rank C ← the decisive key
# Added to the end of total calculation
total = (...) * tekisei_f * class_f * baba_f * basho_factor * venue_dm * venue_rank_f
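As a rough illustration of how a file with this structure could be read into the `{(venue_name, attribute): factor}` mapping, here is a minimal sketch. The function names `load_venue_correction` and `lookup`, and the neutral 1.0 fallback for missing keys, are assumptions; the actual loader in V58py is not shown in this article.

```python
import csv
import io

# Sample rows copied from the structure shown above (Kyoto only).
SAMPLE_CSV = """venue,attribute,factor
Kyoto,dm,1.000
Kyoto,rank_S,1.050
Kyoto,rank_A,1.020
Kyoto,rank_B,1.000
Kyoto,rank_C,0.960
"""

def load_venue_correction(text):
    """Parse CSV text into {(venue, attribute): factor}."""
    reader = csv.DictReader(io.StringIO(text))
    return {(r["venue"], r["attribute"]): float(r["factor"]) for r in reader}

def lookup(table, venue, attribute, default=1.0):
    """Return the factor, or a neutral default when the key is missing (assumed fallback)."""
    return table.get((venue, attribute), default)

table = load_venue_correction(SAMPLE_CSV)
venue_dm = lookup(table, "Kyoto", "dm")
venue_rank_f = lookup(table, "Kyoto", "rank_C")
print(venue_dm, venue_rank_f)
```

Falling back to 1.0 keeps an unlisted venue or rank neutral rather than raising an error; whether that is the right behavior depends on how strictly the CSV should be validated.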
Two coefficient types were introduced for a reason:
| Coefficient | Role | Effect on Rankings |
|---|---|---|
| venue_dm | Venue-level coefficient (same for all horses) | None (uniform · preserved as "dark matter") |
| venue_rank_f | Venue × jockey rank (differs per horse) | Yes ✅ (different jockey ranks get different coefficients within the same race) |
Because venue_rank_f applies different coefficients based on jockey rank within the same race, it broke through the all-uniform multiplication wall for the first time.
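To see why a per-rank coefficient can reorder a race where a uniform one cannot, consider this small sketch. The factors are the sample Kyoto values from the CSV structure above; the three horses, their indices, and their jockey ranks are hypothetical.

```python
# Factors from the sample Kyoto rows shown in the CSV structure.
KYOTO_RANK_F = {"S": 1.050, "A": 1.020, "B": 1.000, "C": 0.960}

# Hypothetical horses: (name, pre-correction index, jockey rank).
race = [
    ("horse_a", 15.0, "C"),
    ("horse_b", 14.6, "S"),
    ("horse_c", 13.0, "B"),
]

def order(scored):
    """Return names sorted from highest to lowest score."""
    return [name for name, _ in sorted(scored, key=lambda x: x[1], reverse=True)]

before = order([(n, s) for n, s, _ in race])
after = order([(n, s * KYOTO_RANK_F[r]) for n, s, r in race])
print(before)
print(after)
```

Here the Rank C horse drops from 15.0 to 14.4 while the Rank S horse rises from 14.6 to 15.33, so the top two swap places. The same mechanism is what makes over-sized coefficients dangerous.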
5. 3-Layer → 4-Layer Architecture in V4
[Layer 1] Default values (constants)
↓ class_f, baba_f etc. — same for all horses
[Layer 2] basho_correction_kensyou.csv (per venue)
↓ basho_factor + venue_dm (uniform · no ranking effect)
[Layer 3] tekisei_hoseichi.csv (per horse)
↓ tekisei_f (varies per horse · affects rankings)
[Layer 4] venue_correction.csv (venue × jockey rank)
↓ venue_rank_f (varies per horse · ✅ affects rankings ← V4's most important)
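One way to read the four layers is as a split between a race-wide part and a per-horse part of the multiplier. The sketch below is only an interpretation of the diagram: `split_multiplier` is a hypothetical helper, and all numeric values are invented.

```python
def split_multiplier(class_f, baba_f, basho_factor, venue_dm, tekisei_f, venue_rank_f):
    """Split a horse's multiplier into its race-wide and per-horse parts."""
    uniform_part = class_f * baba_f * basho_factor * venue_dm  # Layers 1-2
    individual_part = tekisei_f * venue_rank_f                 # Layers 3-4
    return uniform_part, individual_part

# Hypothetical values: both horses share the race-wide factors.
race_wide = dict(class_f=1.10, baba_f=0.98, basho_factor=1.0, venue_dm=1.0)
u1, i1 = split_multiplier(**race_wide, tekisei_f=1.05, venue_rank_f=0.96)
u2, i2 = split_multiplier(**race_wide, tekisei_f=0.97, venue_rank_f=1.05)

print(u1 == u2)  # compares the race-wide parts of the two horses
print(i1, i2)    # per-horse parts of the two horses
```

Only the Layer 3 and Layer 4 product differs between horses in the same race, so only that product can influence the finishing order the model predicts.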
6. V58py No.36–39 Change Log
| No. | Date | Change |
|---|---|---|
| No.36 | May 19 | Complete migration of V5py algorithm · cloning complete. Pure verification environment realized. |
| No.37 | May 20 | Retired TRACK_FACTOR · renamed track_f → tekisei_f · added CSV usage logging. Synced with V5py No.66. |
| No.38 | May 22 | 🔑 Introduced venue_correction.csv · implemented venue_dm and venue_rank_f. V4's most important change. |
| No.39 | May 23 | Fixed venue*C full coefficient display · NaN guard in 7 locations · datetime fix. Began 2-day verification. |
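The article does not show what the NaN guards added in No.39 look like. As a hedged illustration of the general idea, a guard might replace unusable values with a neutral factor before they reach the multiplication. `safe_factor` and its 1.0 default are assumptions, not the V58py implementation.

```python
import math

def safe_factor(value, default=1.0):
    """Return value as a float, or a neutral default if it is
    missing, non-numeric, NaN, or infinite."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return default
    return number if math.isfinite(number) else default

print(safe_factor("0.96"), safe_factor(float("nan")), safe_factor(None))
```

Without such a guard, a single NaN factor would turn a horse's whole total into NaN, since NaN propagates through multiplication.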
7. 2-Day Full-Race Verification — Results and Final Verdict
Verification Results (May 23–24 · 69 races total)
| Metric | Pre-change V58py (BM · 403R) | May 23 (34R) | May 24 (35R) | Assessment |
|---|---|---|---|---|
| 1st-place hit rate | 15.4% | 20.6% | 19.4% | ✅ Improved |
| Top-3 hit rate | 44.0% | 38.2% | — | ⚠️ Deteriorated |
| Tokyo top-3 rate | 40.0% | 41.7% | 50.0% | ✅ Improved |
| Kyoto top-3 rate | 40.4% | 33.3% | 22.0% | ❌ Collapsed |
Kyoto venue_rank_f Measured Values — The Core Problem
| Jockey Rank | venue*C avg | Effect |
|---|---|---|
| S | 0.647 | -35% penalty (excessive) |
| A | 0.634 | -37% penalty (excessive, and lower than S, which is unnatural) |
| B | 1.095 | +10% boost (excessive) |
| C | 0.889 | -11% penalty |
Concrete example of damage:
Race 4 — Etoile Bouquet (Yokoyama K., Rank A): pre-adjustment index 16.38 (race high) → penalized to 10.13 (8th place). Actual finish: 1st.
The same pattern repeated on both days. The +10% boost for Rank B and the penalties of roughly 35 to 37% for Ranks S and A are confirmed as over-correction.
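The percentages quoted for Kyoto follow directly from the measured averages, and the Etoile Bouquet example implies its own effective multiplier. The arithmetic can be reproduced as below; the variable names are illustrative, and the numbers are the ones reported in this section.

```python
# Measured Kyoto venue*C averages by jockey rank, as reported above.
measured = {"S": 0.647, "A": 0.634, "B": 1.095, "C": 0.889}

# Percentage effect relative to a neutral coefficient of 1.0.
effect_pct = {rank: round((coef - 1.0) * 100, 1) for rank, coef in measured.items()}
print(effect_pct)

# Etoile Bouquet (Rank A): index 16.38 before adjustment, 10.13 after.
implied_multiplier = 10.13 / 16.38
print(round(implied_multiplier, 3))
```

The implied multiplier for that horse comes out near 0.62, close to the Rank A average of 0.634, which is enough on its own to push a race-high index down the order.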
Final Verdict
| Decision | Details |
|---|---|
| V5py migration | On hold (top-3 rate fell below baseline of 44.0%) |
| Kyoto venue_rank_f | Over-correction confirmed (same pattern on both days) |
| Tokyo / Niigata | Broadly effective (V58py at par or slightly better) |
| Next week's plan | Create venue_correction_design.py (hosei_mixer-style tool) → adjust Kyoto coefficients → re-verify |
8. Design Philosophy Learned from V4
Lesson 1: "Corrections That Don't Work" Are Hard to Find
The CSV is being read. The formula includes it. The log shows it running. Everything looks correct — yet it has absolutely no effect on the hit rate. You cannot find this kind of bug without deeply asking "what is this calculation actually for?" Surface-level code review will never catch it.
Lesson 2: Dark Matter Influences Each Horse Differently
The initial hypothesis — "venue-specific forces should affect all horses equally" — was wrong. What the data showed was "something that works in favor of certain horses while working against others." This led to the venue_rank_f design.
Lesson 3: New Features Go to V58py First, Then Migrate to V5py After Verification
Starting in V4, the correct staged design was finally realized: "V58py is the forward-deployment environment for new features; migrate to V5py only after verification confirms improvement." The decision to hold off on migration is itself proof that the cycle is working correctly.
V4 Confirmed Coding Rules
| # | Rule |
|---|---|
| ① | Increment No.XX on every change · no skipping |
| ② | Verify header comments against actual code with grep before updating |
| ③ | Run scan_dependencies.py after every py change |
| ④ | Always update _batch_dirs when updating batch folder (absolute rule) |
| ⑤ | Before introducing any new CSV, confirm at the design stage: "Will all-horse uniform multiplication affect rankings?" (Added in V4) |
Rule ⑤ was born from V4's "design vs. implementation gap" discovery. Rules come from problems actually encountered, not theory — that philosophy hasn't changed since V3.
Closing — Into the V5 Period
| Task | Details |
|---|---|
| 🔜 Create venue_correction_design.py | hosei_mixer-style interactive design tool. Build a safe system for managing coefficients. |
| 🔜 Retire basho_correction (2 files) | Both are hollow at 1.000 across all venues. Consolidate into venue_correction.csv. |
| 🔜 Adjust Kyoto venue_rank_f | B: 1.095 → 1.02–1.05 / A·S: 0.63–0.65 → 0.88–0.95 (loosen). Re-verify next week. |
| 🔜 Migrate venue_rank_f to V5py | Migrate only after next week's re-verification confirms top-3 rate above 42.7%. |
| 🔜 Back-port trainer_b | Port V58py's penalty design (thin-record trainers -0.5) to V5py. Target: +3.2pt 1st-place rate. |
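For the Kyoto adjustment row in the table above, a coefficient-management tool could include a simple range check. This is a sketch of one possible check, not a description of the planned venue_correction_design.py: `out_of_range` and `TARGET_RANGES` are hypothetical names, and the ranges are the proposed targets listed in the table.

```python
# Proposed target ranges from the plan (Rank C has no stated target).
TARGET_RANGES = {"S": (0.88, 0.95), "A": (0.88, 0.95), "B": (1.02, 1.05)}

def out_of_range(coefs, ranges):
    """Return the coefficients that fall outside their target range."""
    return {
        rank: value
        for rank, value in coefs.items()
        if rank in ranges and not (ranges[rank][0] <= value <= ranges[rank][1])
    }

# Measured Kyoto averages from the 2-day verification.
current = {"S": 0.647, "A": 0.634, "B": 1.095, "C": 0.889}
violations = out_of_range(current, TARGET_RANGES)
print(violations)
```

Ranks without a stated target are skipped rather than flagged, so the check reports only what the plan explicitly intends to change.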
If V3's biggest discovery was the value of deletion — "rotation correction was noise" — then V4's biggest discovery was structural: "a correction we believed was working had never worked at all."
And the "hold" verdict this time is not a failure. We identified exactly where the coefficients were too large, and we can correct them next week — that is V4's real achievement.
"Theory doesn't change the design. Real data does."
That philosophy, written in the first article, holds here too — in the challenge of encoding dark matter into a formula.
reisugo_keibayosou_app_v1_hikaku_v58.py v5.8 No.39 / seiki_keibayosou_app_v5.py v5.7 No.67
2-day verification: All races 1R–12R completed / 2026-05-23 to 2026-05-24
Next issue (Vol. 4): venue_correction_design.py completion · re-verification results after Kyoto coefficient adjustment · V5py migration decision.