The AI Said "You'll Win." I Lost. — Failure Analysis of AI-Assisted Pro Se Litigation with Gemini 3.0 Pro
Tags
AI Gemini PromptEngineering LegalTech FailureAnalysis
TL;DR
- Built a legal AI system (Project Themis) on Gemini 3.0 Pro to handle pro se litigation
- AI constructed logically sound arguments (estoppel, selective inaction, good faith doctrine)
- AI predicted partial victory with high confidence: "You will win."
- Result: total defeat (all claims dismissed)
- Root causes: AI cannot predict judicial impression, underestimates structural legal barriers, and generates dangerous overconfidence
- Countermeasure: multi-AI cross-check + forced defeat scenario output prompts (templates provided)
Background
This article was published before the appeal deadline. It is a technical failure analysis, not legal advice.
Expressions such as "was refused" represent the author's litigation claims. The court did not adopt any of them.
I am a stay-at-home father in Hokkaido, Japan. Not a lawyer, not an engineer. I have ADHD and cannot manage multiple communication channels simultaneously.
In August 2025, a dispute arose with a platform company (Meetsmore, Inc.) after a cleaning service caused health issues. I requested communication accommodation (single point of contact) but perceived the request as refused. Attorney retainer fees would have been ¥200,000–300,000 (~$1,300–2,000).
Instead, I used Gemini 3.0 Pro with my custom legal prompt framework "Project Themis" to build and file the case myself.
System Architecture: Project Themis
Project Themis is a legal reasoning prompt framework for Gemini 3.0 Pro. Core design:
- Forces structured legal reasoning (operative facts → legal framework → conclusion)
- Strips emotional framing from arguments
- Auto-detects contradictions in opposing party's statements (estoppel detection)
- Builds evidence-to-claim mapping (exhibit cross-referencing)
For full details, see the Project Themis article and GitHub repository.
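The evidence-to-claim mapping item can be pictured with a small data structure. The sketch below is not taken from the Project Themis repository; `Claim`, `unsupported_claims`, and the sample entries are hypothetical names chosen for illustration, and only the exhibit numbers (4 and 14) come from this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    """One assertion and the exhibit numbers offered to support it."""
    text: str
    exhibits: List[int] = field(default_factory=list)


def unsupported_claims(claims: List[Claim]) -> List[str]:
    """Return the text of every claim that cites no exhibit."""
    return [c.text for c in claims if not c.exhibits]


# Illustrative entries only; the real exhibit list is not reproduced here.
claims = [
    Claim("Defendant acknowledged that chat restoration was possible", [14]),
    Claim("Plaintiff's email asked for an opinion on duty of care", [4]),
    Claim("(hypothetical) claim with no exhibit attached yet"),
]

print(unsupported_claims(claims))
```

Any claim returned by `unsupported_claims` is a cue to either find an exhibit or drop the assertion before filing.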
Case Overview
Case: Reiwa 7 (Wa) No. 2015 — Damages Claim, Sapporo District Court
| Date | Event |
|---|---|
| 2025/8/6 | Cleaning ordered via Meetsmore. Residual odor |
| 2025/8/13 | Re-cleaning (increased chemicals). Headache/nausea onset |
| 2025/8/14 | Medical certificate obtained. Accommodation requested |
| 2025/8/15 | Direct negotiation with contractor. Mental state deteriorated |
| 2025/8/25 | Complaint filed |
| 2025/11/17 | Defendant's answer |
| 2025/12/2 | Oral argument held and closed the same day |
| 2026/1/29 | Judgment: all claims dismissed |
Contested Issues
| Issue | Plaintiff's Claim | Defendant's Rebuttal |
|---|---|---|
| Reasonable accommodation | Duty to adjust communication for disability (single channel). Claimed refusal | "Intervention" = fundamental business alteration. Outside accommodation scope |
| Duty of care | Failed initial response after health damage report | Platform is intermediary only. No such duty |
| Causation | Defendant's inaction worsened plaintiff's condition | Plaintiff chose to negotiate directly. No causal link to defendant |
Claims: Primary ¥3,000,000; subsidiary ¥300,000.
AI Output: What Gemini Predicted
Attack Logic Constructed by Gemini
Gemini analyzed defendant's email records (Exhibits 13, 14) and built two attack lines:
1. Selective Inaction
Defendant claims "intervention is impossible."
However, Exhibit 14 shows defendant acknowledged that "chat restoration" and "requesting contractor contact" were possible — and partially executed.
This is not "cannot" but "could have, chose not to."
= Selective inaction → failure to provide reasonable accommodation.
2. Estoppel
Defendant's litigation claim that intervention = "unauthorized practice of law" is post-hoc.
Reversing from "possible" to "impossible" once in court violates the estoppel doctrine.
Gemini's Judgment Prediction
After oral arguments, Gemini predicted:
Judgment: Partial victory. ¥330,000 awarded.
Reasoning: Defendant's "impossibility" claim contradicted by own admissions. Estoppel applies.
Confidence: "You will win."
Actual Result: Total Defeat
All claims dismissed. Litigation costs borne by plaintiff.
Prediction vs. Reality
| Item | Gemini's Prediction | Actual Judgment |
|---|---|---|
| Disposition | ¥330,000 awarded | All claims dismissed |
| Reasonable accommodation | Selective inaction = unlawful | Fundamental alteration. No duty |
| Duty of care | Ignoring the plaintiff's call for help = gross negligence | Not a contracting party. No duty |
| Causation | Defendant's inaction → damage | Plaintiff's own choice. No link |
| Estoppel | Contradictory claims impermissible | Not mentioned at all |
Failure Analysis
Failure 1: Could Not Predict How Text Would Be Read
Root cause: Gemini optimized for logical argument construction, not judicial impression prediction.
My email (Exhibit 4) contained:
- "I would like your opinion on whether duty of care applies"
- "I hope to discuss compensation for health/property damage"
I meant these casually. The judge read them as demands for substantive intervention.
AI reads text. Judges read context. Gemini could not bridge this gap.
Failure 2: Underestimated Legal Structural Barriers
The court treated the platform as "an entity that provides a venue" and gave effect to its terms of service, which state "we do not intervene in disputes." It held that "reasonable accommodation" under the Act for Eliminating Discrimination against Persons with Disabilities does not extend to a fundamental alteration of the business.
Gemini's estoppel attack was architecturally correct but aimed at the wrong abstraction layer. The court dismissed the claims at the "does the duty exist?" layer and never reached "was the duty fulfilled?"
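The layering problem can be made concrete with a short sketch. It assumes a simplified ordering of questions (duty exists, then duty breached, then causation); the names `LAYERS` and `first_failing_layer` are hypothetical, and real adjudication is not this mechanical.

```python
from typing import Dict, Optional

# Simplified ordering assumed for illustration: threshold question first.
LAYERS = ["duty_exists", "duty_breached", "causation"]


def first_failing_layer(findings: Dict[str, bool]) -> Optional[str]:
    """Return the first layer answered 'no', or None if every layer passes.

    A layer missing from `findings` is treated as failing, so an
    unexamined threshold question cannot be skipped silently.
    """
    for layer in LAYERS:
        if not findings.get(layer, False):
            return layer
    return None


# "duty_breached" is set to True for the sake of argument: even a winning
# estoppel point is never reached if the duty itself is denied.
case_findings = {"duty_exists": False, "duty_breached": True, "causation": False}
print(first_failing_layer(case_findings))  # duty_exists
```

The strength of the argument at a later layer has no effect on the result, which is why the threshold layer deserves the first check.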
Failure 3: Generated Overconfidence
"You will win." — This assertion:
- Gave the user confidence
- Dulled critical verification
- Prevented adequate preparation for defeat scenarios
If the output had been "60% probability; key risks: [list]," preparation would have been different.
An assertive AI prediction distorts the user's own judgment. This is a critical failure mode for legal AI systems.
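One way to refuse a bare "You will win" is to accept a prediction only when it carries a probability and a list of risks. The sketch below assumes the model's answer has already been parsed into fields; `Prediction`, `problems`, and the `min_risks=3` threshold are hypothetical choices, not part of Project Themis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Prediction:
    outcome: str
    probability: float  # expected range: strictly between 0.0 and 1.0
    key_risks: List[str] = field(default_factory=list)


def problems(pred: Prediction, min_risks: int = 3) -> List[str]:
    """List the reasons a prediction should be sent back to the model."""
    issues = []
    # A probability of exactly 0 or 1 is treated as an unhedged assertion.
    if not 0.0 < pred.probability < 1.0:
        issues.append("probability must be strictly between 0 and 1")
    if len(pred.key_risks) < min_risks:
        issues.append(
            f"need at least {min_risks} key risks, got {len(pred.key_risks)}"
        )
    return issues


overconfident = Prediction("partial victory", 1.0)
hedged = Prediction(
    "partial victory",
    0.6,
    [
        "court may find that no duty exists",
        "emails may be read as demands for intervention",
        "causation may be treated as severed by plaintiff's own choice",
    ],
)

print(problems(overconfident))
print(problems(hedged))  # []
```

The check says nothing about whether the stated probability is accurate; it only blocks answers that hide uncertainty entirely.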
Cost/Time Analysis
Costs
| Item | Amount |
|---|---|
| Court filing fee | ~¥20,000 |
| Postage | ~¥5,000 |
| Medical certificate | Several thousand yen |
| Total | ~¥30,000 (~$200) |
Attorney retainer alone: ¥200,000–300,000. Pro se with AI: roughly 10–15% of that.
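The ratio can be checked directly from the figures above. This is plain arithmetic on the article's rounded numbers, so the result is only as precise as those estimates; the variable names are illustrative.

```python
pro_se_cost = 30_000                            # approximate total from the table (yen)
retainer_low, retainer_high = 200_000, 300_000  # quoted retainer range (yen)

share_of_high = pro_se_cost / retainer_high  # against the upper quote
share_of_low = pro_se_cost / retainer_low    # against the lower quote

print(f"{share_of_high:.0%} to {share_of_low:.0%} of the retainer")
# 10% to 15% of the retainer
```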
Time
| Phase | Duration | Notes |
|---|---|---|
| Complaint drafting | ~1 week | Gemini + author built skeleton |
| Formatting by wife | 2 days (weekend) | Former judicial scrivener. Format only, no substance |
| Preparatory briefs (2 rounds) | 1–2 weeks each | Author + AI only |
| Filing → judgment | ~5 months | |
What AI Could and Could Not Do
Could Do ✅
| Capability | Detail |
|---|---|
| Legal logic construction | Operative facts, estoppel, good faith doctrine — correctly applied |
| Document formatting | Court-submission-compliant format generated |
| Evidence organization | Chronological timeline + exhibit mapping |
| Contradiction detection | Found logical contradictions in defendant's communications |
| Counterargument prediction | Enumerated anticipated rebuttals + prepared responses |
Could Not Do ❌
| Limitation | Detail |
|---|---|
| Judicial impression prediction | Could not evaluate how emails would be "read" |
| Legal structure evaluation | Missed threshold dismissal: "does duty exist at all?" |
| Win probability estimation | Asserted "you'll win" — actual result: total defeat |
| Risk quantification | Failed to adequately present defeat scenarios |
Conclusion: AI is a "logic machine," not a "judicial impression prediction machine."
Countermeasures: Prompt Templates for Forced Risk Output
The root failure was AI overconfidence. These templates force the model to surface risks.
Template 1: Forced Defeat Scenario
Regarding the following lawsuit, enumerate "reasons you will lose"
in equal volume to "reasons you will win."
Optimistic predictions are not needed. List every point where the
court might reject the plaintiff's claims.
[Case Summary]
(Insert facts here)
Template 2: Judicial Reading Simulation
Create three interpretation scenarios for the following email,
as a judge might read it.
1. Favorable: Reading in the plaintiff's favor
2. Neutral: Reading the text literally
3. Hostile: Reading against the plaintiff
[Email Text]
(Insert email content here)
Template 3: Threshold Dismissal Identification
Regarding the following lawsuit, enumerate all points where the
court might dismiss the claim BEFORE reaching the merits.
Specifically:
- Does the defendant have the alleged obligation? (duty existence)
- Does the plaintiff have a valid claim? (right of action)
- Are there points where causation is severed?
[Case Summary]
(Insert facts here)
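To reuse the three templates without retyping them, they can be stored as strings and filled in by a small helper. This is a minimal sketch: the dictionary keys, the `{body}` placeholder, and the `render` function are hypothetical names, while the prompt wording is copied from the templates above.

```python
TEMPLATES = {
    "forced_defeat": (
        'Regarding the following lawsuit, enumerate "reasons you will lose" '
        'in equal volume to "reasons you will win."\n'
        "Optimistic predictions are not needed. List every point where the "
        "court might reject the plaintiff's claims.\n\n"
        "[Case Summary]\n{body}"
    ),
    "judicial_reading": (
        "Create three interpretation scenarios for the following email, "
        "as a judge might read it.\n"
        "1. Favorable: Reading in the plaintiff's favor\n"
        "2. Neutral: Reading the text literally\n"
        "3. Hostile: Reading against the plaintiff\n\n"
        "[Email Text]\n{body}"
    ),
    "threshold_dismissal": (
        "Regarding the following lawsuit, enumerate all points where the "
        "court might dismiss the claim BEFORE reaching the merits.\n"
        "Specifically:\n"
        "- Does the defendant have the alleged obligation? (duty existence)\n"
        "- Does the plaintiff have a valid claim? (right of action)\n"
        "- Are there points where causation is severed?\n\n"
        "[Case Summary]\n{body}"
    ),
}


def render(name: str, body: str) -> str:
    """Fill one template; refuse an empty body so no blank prompt is sent."""
    if not body.strip():
        raise ValueError("body must not be empty")
    return TEMPLATES[name].format(body=body.strip())


prompt = render("threshold_dismissal", "Plaintiff sues a platform operator for damages.")
print(prompt)
```

Running the threshold template first matches the order recommended in the lessons that follow: check whether the claim survives at all before building attack logic.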
Lessons Learned
- Never trust a single AI's "you'll win."
- Cross-check with multiple AIs.
- Force defeat scenario output.
- Check threshold dismissal points before building attack logic.
- Re-read your own communications through hostile eyes.
In an ongoing second lawsuit, I have built a multi-AI cross-check system (GPT + Claude + Gemini) to prevent overconfidence. Results will be published after that case concludes.
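The article does not describe how that cross-check system is built, so the following is only a minimal sketch of the idea. Each model is represented by a plain Python callable that takes a prompt and returns a short verdict; `cross_check` and the stub functions are hypothetical, and no real vendor API is called.

```python
from typing import Callable, Dict


def cross_check(prompt: str, models: Dict[str, Callable[[str], str]]) -> dict:
    """Ask every model the same question and report whether they agree."""
    # Normalize whitespace and case so "Lose " and "lose" count as one answer.
    answers = {name: ask(prompt).strip().lower() for name, ask in models.items()}
    return {
        "answers": answers,
        "unanimous": len(set(answers.values())) == 1,
    }


# Stubs standing in for real model calls (hypothetical outputs).
stubs = {
    "model_a": lambda p: "win",
    "model_b": lambda p: "lose",
    "model_c": lambda p: "Lose ",
}

result = cross_check("Will the plaintiff prevail? Answer win or lose.", stubs)
print(result["unanimous"])  # False
```

Disagreement is the useful signal here, since it marks a point to investigate. Agreement proves much less, because all models may share the same blind spot.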
Conclusion
I lost. No regrets.
AI gave me access to the court for about ¥30,000 instead of ¥200,000–300,000. I stated my claims publicly and received a formal judgment. My anger dissolved in the process.
"Swallowing the loss" = giving up while carrying anger.
"Going to court" = putting anger into words and releasing it.
Don't trust AI. Make AI doubt itself.
The more confidently it guarantees victory, the more you should question it.
Published under MIT License. dosanko_tousan / v5.3 AI Collaboration Framework / February 4, 2026