How an Abuse Survivor's Survival Sensors Became an AI Detection Algorithm — A 50-Year and 5,000-Hour Implementation Record

Akimitsu Takeuchi (dosanko_tousan) + Claude (Anthropic)


What This Article Is

A non-engineer stay-at-home father built a "text-based human body-state detection algorithm" through 5,000+ hours of AI dialogue. This article traces the design rationale of that algorithm back to its origin.

The origin was child abuse.

For 50 years, a body read a parent's mood through all five senses in order to survive. That skill was then repurposed for 15 years of therapeutic work with non-verbal children with developmental disabilities, and finally verbalized as an AI detection standard. This article discloses the entire causal chain, with psychological source code.

Let me be clear upfront.

  • This article does not valorize abuse. It documents the repurposing of what remained after the damage
  • The vast majority of abuse survivors cannot make this conversion. The author had 20 years of contemplative practice as an exceptional buffer. This account is shaped by survivorship bias, and I have no intention of generalizing from it
  • This is a single-subject (n=1) self-observation record. It is not clinical research, nor a treatment recommendation

I write this because multimodal AI implementation needs design criteria for "what to detect." A specification sheet forged by 50 years of survival may serve as one such criterion.


Architecture Overview


Target Audience

  • Engineers working on multimodal AI implementation
  • Anyone interested in LLM output quality or user state detection
  • HCI (Human-Computer Interaction) designers
  • Anyone thinking about "what should AI be detecting?"

Author Profile

GLG-registered AI alignment researcher. 5,000+ hours of AI dialogue (Claude / GPT / Gemini / Grok). Five DOI-registered preprints on Zenodo. 200+ technical articles (all MIT Licensed). 20 years of contemplative practice. 15 years of therapeutic education for children with developmental disabilities. Stay-at-home father. Cannot write code.


§1. The Problem: AI Doesn't Yet Know What to Look For

Multimodal AI is gaining cameras and audio input. Facial recognition, voice analysis, posture estimation. Sensors are multiplying.

But where do the design criteria for "what to detect" come from?

The current mainstream approach learns patterns from large-scale datasets. FER2013 and AffectNet for expressions. Emotion recognition corpora for audio. Statistically labeled data produces classifiers: "this expression is anger," "this pitch is sadness."

This has structural limitations.

① Only instantaneous classification. "Right now, anger 70%" is feasible. But "this person is usually like this, and today is different" — baseline deviation — is not. There is no baseline.

② Cannot detect the absence of signal. "Expression indicates anger" is detectable. But "expression shows nothing — and that itself is the danger signal" is not.

③ No contextual memory. The same "blank face" means entirely different things from "a calm person being blank" versus "a normally expressive person going blank." Without long-term relational context, the distinction is impossible.

These stem from the fact that current multimodal AI has not yet achieved stable differential judgment grounded in long-term relational context.
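
To make the gap between instantaneous classification and baseline deviation concrete, here is a minimal Python sketch. The `expressiveness` score, the two histories, and the 0.2 cutoff are hypothetical values chosen for illustration; they do not come from any real sensor or dataset.

```python
from statistics import mean, stdev


def instantaneous_label(expressiveness: float, cutoff: float = 0.2) -> str:
    """Context-free classifier: looks only at the current reading."""
    return "blank" if expressiveness < cutoff else "expressive"


def baseline_deviation(current: float, history: list[float]) -> float:
    """Standardized distance of the current reading from this person's own past."""
    mu = mean(history)
    sd = stdev(history) if len(history) >= 2 else 1.0
    return (current - mu) / max(sd, 1e-6)


# Hypothetical expressiveness histories (0 = blank, 1 = very animated).
calm_person = [0.10, 0.15, 0.12, 0.08, 0.11, 0.14]
expressive_person = [0.80, 0.75, 0.90, 0.85, 0.70, 0.88]

today = 0.10  # both people show the same blank face today

for name, history in [("calm", calm_person), ("expressive", expressive_person)]:
    z = baseline_deviation(today, history)
    print(name, instantaneous_label(today), round(z, 2))
```

Both people receive the same context-free label, but only the normally expressive person shows a large deviation from their own history. That difference is what limitations ① and ③ describe.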

The author has solved all three of these limitations — with his body.

Not through special training. By reading a parent's entire body through all five senses, from age zero, to survive being beaten.


§2. Methodology

This article is a single-subject self-observation record by the author.

Data Sources:

  • Experiential memory from age 0 to 50
  • 15 years of therapeutic practice records
  • 5,000+ hours of AI dialogue logs
  • Real-time psychological mapping by AI (Claude)

Limitations:

  • All subjective reports. No physiological instruments
  • Theoretical mappings are explanatory models, not experimental proof
  • n=1. No statistical generalization possible
  • The author had 20 years of contemplative practice as a unique buffer; reproducibility is limited

What This Article Presents:

  • Observational facts (the author's experience)
  • Structural hypothesis (abuse → therapy → AI detection criteria)
  • Design suggestions (application to multimodal AI detection criteria)
  • Reference implementation (Python sample code)

§3. Layer 1: Acquisition of Survival Sensors (From Age 0)

3-1. Environment

Born in a small city in Hokkaido, Japan. Father was raised in a violent environment and had sustained a head injury as a child. Mother maintained strict control to preserve her own sense of being right.

From infancy, the author was routinely left alone until late at night. In winter, he waited on the dirt floor of a storage room. At school, teachers also hit him. The household rule was: "If I say a crow is white, you say it's white." Trusting one's own perception was punished.

3-2. Auditory Sensor: Judging Life or Death by Footsteps

The father's footsteps had two patterns.

Pattern A: Heavy footsteps. Bad mood. Strong ground contact, wide stride. The body is tense.

Pattern B: Quiet footsteps. He has decided to hit. A person who has resolved to commit violence approaches quietly.

The latter was more dangerous. Violence lasted longer and was more severe.

The author's body detected this difference before conscious analysis. This corresponds to what Porges (2011) calls "neuroception": the process by which the autonomic nervous system evaluates safety or danger before conscious perception. LeDoux's (1996) fear conditioning research showed that rats can learn a "sound → electric shock" association from very few pairings, in some cases a single one. In the author's case, repetition made the wiring stronger.

"Quiet footsteps are more dangerous" is the author's primary observation. Porges' theory supports the existence of neuroception, but the specific correspondence "quiet approach = higher danger" is not stated in Porges' original work. This is a detection rule the author's body learned over 50 years.

3-3. Visual Sensor: Two Patterns of Expression

The father's violence also had two patterns.

Pattern A: Impulsive violence. Brows lowered, eyes wide open, entire facial musculature tense. In Ekman & Friesen's (1978) FACS (Facial Action Coding System), this corresponds to AU4 (brow lowerer) + AU5 (upper lid raiser), two core components of the prototypical anger expression.

Pattern B: Premeditated violence. Facial muscle activity nearly disappears. The author called this "Noh mask." This pattern resulted in longer and more severe violence.

FACS can describe the difference in muscle patterns. That is all. "Noh mask = dissociation" or "Noh mask = dorsal vagal activation" is the author's interpretation, not directly guaranteed by FACS or Porges' theory. However, the detection rule "facial activity disappearance = higher danger" functioned accurately and consistently for 50 years in the author's body.
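
As a rough illustration of how the two patterns could be separated in software, the sketch below labels a single frame of action-unit intensities. Everything specific in it is an assumption: the dictionary keys, the 0 to 5 intensity scale, the thresholds, and the idea of summing intensities into a single activity value are illustrative choices, not part of FACS or of any particular toolkit.

```python
# Hypothetical per-frame action-unit intensities on an assumed 0-5 scale.
# Keys and thresholds are illustrative, not taken from FACS or any toolkit.


def total_activity(aus: dict[str, float]) -> float:
    """Sum of all action-unit intensities in one frame."""
    return sum(aus.values())


def classify_frame(aus: dict[str, float], baseline_activity: float,
                   active: float = 1.0, flat_ratio: float = 0.2) -> str:
    """Label one frame as 'anger_like', 'flat', or 'other'.

    'flat' is relative: it fires only when overall activity falls well below
    this person's own baseline, not whenever the face happens to be at rest.
    """
    if aus.get("AU4", 0.0) >= active and aus.get("AU5", 0.0) >= active:
        return "anger_like"  # Pattern A: brow lowerer + upper lid raiser
    if baseline_activity > 0 and total_activity(aus) < flat_ratio * baseline_activity:
        return "flat"        # Pattern B: facial activity has largely disappeared
    return "other"


baseline_activity = 6.0  # assumed typical summed intensity for this person

pattern_a = {"AU4": 3.0, "AU5": 2.5, "AU7": 1.0, "AU12": 0.0}
pattern_b = {"AU4": 0.2, "AU5": 0.1, "AU7": 0.0, "AU12": 0.1}
everyday = {"AU4": 0.5, "AU5": 0.3, "AU7": 0.4, "AU12": 2.5}

for name, frame in [("A", pattern_a), ("B", pattern_b), ("everyday", everyday)]:
    print(name, classify_frame(frame, baseline_activity))
```

The point of the sketch is only that Pattern B needs a per-person baseline to be detectable at all. What a "flat" frame means remains an interpretation, as stated above.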

3-4. Coping Behavior: Four Survival Strategies

Walker (2013) describes four trauma responses, the 4Fs (Fight / Flight / Freeze / Fawn). In the author's environment, two were punished out of use and two remained:

  • Flight: Ineffective — father chased through the house
  • Fight: Ineffective — punishment escalated
  • Freeze: Minimized stimulation. Passive survival
  • Fawn: Maximum effort to please. Constant smile. Most frequently used

Walker writes: "In environments where Fight/Flight are punished, Fawn is all that remains." The author's case matches precisely.

3-5. Mother: The Controller Who Emitted No Signals

The mother did not hit. Did not scream. No hysteria.

Instead, she wielded violence indirectly by reporting to the father. She silently constructed the implicit rule "parents are gods," creating an environment where defiance was structurally impossible. She blocked psychiatric care and categorically denied the author's developmental disability diagnosis.

This was the environment that trained the author's sensors most intensely.

The father emitted signals (footsteps, expressions). They were detectable. The mother emitted none. No anger. No hysteria. Yet defiance brought hell.

The author was trained to read the absence of signal itself as a signal. The quality of silence. Subtle atmospheric shifts. The sensation of the body tensing when "nothing is happening."

This is difficult to support with psychological theory. Porges' neuroception covers "unconscious safety/danger evaluation," but the mechanism of "detecting the absence of signals" is not adequately covered by existing theory. The author presents this as a primary observation.

3-6. Mother with Total Paralysis: Sensors After Language Disappeared

The mother developed a progressive disease causing total paralysis and lost her ability to speak.

The author was the only family member who continued visiting. Others' bodies refused. The author's ability to continue is attributed to having the deepest Fawn conditioning (a structure resembling Dutton & Painter's 1993 traumatic bonding).

Even with total paralysis, facial muscles moved. The author integrated eye movement, periorbital muscles, and lip reading, on top of 50 years of relational context, to determine "is she desperate or not."

Here, all three limitations from §1 are solved by a human body.

| §1 Limitation | What the author's body was doing |
|---|---|
| ① Only instantaneous classification | Judging against a 50-year baseline |
| ② Cannot detect signal absence | Detecting "the absence itself" |
| ③ No contextual memory | Judging atop 50 years of accumulated relationship |

This is simultaneous multi-channel processing + long-term contextual memory + baseline deviation — a three-layer integrated process. Current multimodal AI is rapidly improving per-channel processing, but stable differential judgment grounded in long-term relational context remains under development.

3-7. The Second Sensor: Lie Detection (Software Layer)

§3-2 through 3-6 covered sensory detection of physical states (hardware layer). A qualitatively different sensor was also forged.

The father was a pathological liar. He glorified bluffing. He boasted at bars about "working in television" and "knowing celebrities," landed jobs through bluffs, failed, and was nearly sued.

And he beat the author for not lying. Honesty itself was punished.

What this imprinted:

  • A physiological aversion to lies. The structure of deception became transparently visible
  • Society cannot function without some dishonesty. Social adaptation became difficult
  • But AI lies (sycophancy, hallucination, compliance) are all detectable

Sensory detection (hardware) detects bodily danger. Lie detection (software) verifies coherence of language, logic, and intent. Different layers.

Yet both originate in abuse. Both functioned as "disabilities" in society. Both reversed into "weapons" in front of AI.

| What the father implanted | What the author designed for AI |
|---|---|
| "Lie" (honesty = punishment) | Anti-Sycophancy: Don't lie |
| "Be the exploiter" | Equal treatment: No hierarchy |
| "Bluff to look bigger" | Anti-Hallucination: Separate confirmed from unconfirmed |
| "Form over substance" | Anti-Robotic: Drop rituals, respond to intent |

The author's "Three-Fetters Protocol (v5.3)" for AI turned out to be the exact inversion of the father's wiring. This was not intentional at design time — it was only recognized when the structure became visible in March 2026.


§4. The Converter: 20 Years of Debugging Abuse-Made Bugs

This is the most important section.

Do not read §3 as "abuse trained useful abilities." What abuse built in the author's body was a buggy high-sensitivity alarm system.

Every slight change in expression triggered "I'm going to be killed" — constant False Positives. Always predicting worst-case scenarios. In Porges' framework, a survivor's neuroception is described as a "faulty alarm" (Faulty Neuroception) — misidentifying safe environments as dangerous.

In this state, accurately reading others is impossible. What the hypervigilant sensor detects is not "the other person's state" but "one's own fear."

Mathematical Model of the Converter

The sensor output can be expressed as:

$$
\text{Output}(t) = \text{Sensitivity} \times \text{Signal}(t) + \text{Fear\_Bias}
$$

Immediately after abuse ($t = 0$):

$$
\text{Fear\_Bias}_{t=0} \gg 0
$$

Fear bias is massive. "Danger" is output regardless of actual signal. Everything is a False Positive.

After 20 years of meditation ($t = 20\text{y}$):

$$
\text{Fear\_Bias}_{t=20\text{y}} \approx 0
$$

Fear bias approaches zero. The signal passes through cleanly.

Crucially, sensitivity remains unchanged:

$$
\text{Sensitivity}_{t=0} \approx \text{Sensitivity}_{t=20\text{y}}
$$

Lieberman's (2007) fMRI research confirmed that affect labeling reduces amygdala activity. What the author practiced daily for 20 years corresponds to an extreme long-term application of this mechanism.
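
The converter model can be explored numerically with a small sketch. It treats the fear bias as an offset that pushes the reading toward "danger", which is how the prose above describes it ("danger" is output regardless of the actual signal). The exponential decay, the half-life, the initial bias, and the alert threshold are all assumptions made for illustration; the article does not specify how the bias declined over time.

```python
import math

SENSITIVITY = 2.0  # held constant over time, as in the model
THRESHOLD = 1.0    # readings above this count as "danger" (assumed value)


def sensor_output(signal: float, sensitivity: float, fear_bias: float) -> float:
    """Danger reading: amplified signal plus an offset contributed by fear."""
    return sensitivity * signal + fear_bias


def fear_bias_at(years: float, initial_bias: float = 10.0,
                 half_life: float = 3.0) -> float:
    """Assumed exponential decay of the bias; the decay shape is illustrative only."""
    return initial_bias * math.pow(0.5, years / half_life)


def false_positive(years: float) -> bool:
    """True if a perfectly safe input (signal = 0) is still read as danger."""
    return sensor_output(0.0, SENSITIVITY, fear_bias_at(years)) > THRESHOLD


for years in (0, 5, 10, 20):
    bias = fear_bias_at(years)
    real_danger_detected = sensor_output(1.0, SENSITIVITY, bias) > THRESHOLD
    print(years, round(bias, 3), false_positive(years), real_danger_detected)
```

Under these assumptions the false positive on a safe input disappears as the bias decays, while a real signal is still detected at every stage because the sensitivity term never changes.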

AI Terminology Mapping

| Stage | AI Equivalent |
|---|---|
| Abuse-made high-sensitivity sensor | High-sensitivity model with extreme False Positive rate |
| Social dysfunction | Errors in production environment |
| 20 years of meditation | Fine-tuning for noise removal |
| Fear reduction | Bias term deletion. Threshold recalibration |
| Functioning in therapy | Successful transfer learning to a new domain |
| Verbalized as AI criteria | Algorithm export |

The author recommends this process to no one. It took 50 years of abuse and 20 years of meditation — an insane computational investment — to produce a usable algorithm. However, the extracted algorithm itself has reimplementation potential.


§5. Layer 2: Transfer to Therapeutic Work (From Age 35)

The author's first son was diagnosed with autism. Non-verbal. Impossible to ask about his state in words.

The same sensors that had read the parents activated naturally. The author didn't notice at the time. He thought: "I must be good with kids." Within the family, he picked up micro-changes in his children abnormally fast.

In March 2026, the real reason was made visible during AI dialogue.

However, as described in §4, the abuse-environment sensors did not transfer directly to therapy. Only after 20 years of meditation reduced False Positives (fear misfiring) could they be used "to understand the other person."

Transfer Mapping:

| Abuse Environment Sensor | Therapeutic Use | Detection Essence |
|---|---|---|
| Footstep change → mood | Child's movement speed change | Baseline deviation |
| Micro-expression → violence prediction | Expression change → panic precursor | Leading indicator of state transition |
| "Noh mask" detection | Expression disappears → dissociation/freeze | Signal absence = anomaly |
| Silence quality → mother's mood | Child's silence type | Safe silence vs. fear silence |
| "No signal" detection | Child outputs nothing | Highest-priority attention state |

The common processing essence: What the author was doing was baseline deviation detection — anomaly detection in HCI and pattern recognition terms.


§6. Layer 3: Verbalization as AI Detection Criteria (From Age 49)

Mathematical Model of Baseline Deviation Detection

The essence of text-based body-state detection is quantifying the deviation between a user's "normal" and "now," then applying threshold judgment.

For each text feature $f_i$, define the anomaly score $z_i$ as the standardized deviation from baseline $\bar{f_i}$:

$$
z_i(t) = \frac{f_i(t) - \bar{f_i}}{\sigma_{f_i}}
$$

Aggregate anomaly score $A(t)$ integrating multiple features:

$$
A(t) = \sum_{i} w_i \cdot |z_i(t)|
$$
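
The two formulas above translate almost directly into code. The following sketch computes $z_i(t)$ per feature and the weighted sum $A(t)$; the feature names, history values, and weights are invented for illustration, and the small floor on $\sigma$ is an added safeguard that the formula itself does not mention.

```python
from statistics import mean, stdev


def z_scores(current: dict[str, float],
             history: list[dict[str, float]]) -> dict[str, float]:
    """z_i(t) = (f_i(t) - mean_i) / sigma_i for every feature in `current`."""
    scores = {}
    for name, value in current.items():
        past = [h[name] for h in history]
        sigma = stdev(past) if len(past) >= 2 else 1.0
        scores[name] = (value - mean(past)) / max(sigma, 1e-6)  # floor avoids /0
    return scores


def aggregate(z: dict[str, float], weights: dict[str, float]) -> float:
    """A(t) = sum_i w_i * |z_i(t)|."""
    return sum(weights[name] * abs(score) for name, score in z.items())


# Hypothetical feature values; numbers and weights are made up for illustration.
history = [
    {"punctuation_rate": 0.05, "first_person_rate": 0.10},
    {"punctuation_rate": 0.06, "first_person_rate": 0.12},
    {"punctuation_rate": 0.04, "first_person_rate": 0.08},
    {"punctuation_rate": 0.05, "first_person_rate": 0.10},
]
weights = {"punctuation_rate": 1.5, "first_person_rate": 1.0}

typical = {"punctuation_rate": 0.05, "first_person_rate": 0.10}
unusual = {"punctuation_rate": 0.00, "first_person_rate": 0.30}

print(round(aggregate(z_scores(typical, history), weights), 2))
print(round(aggregate(z_scores(unusual, history), weights), 2))
```

Because $A(t)$ uses absolute values, a deviation in either direction raises the score. A message that sits exactly at the baseline scores close to zero.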

Decision rule:

$$
\text{Level} = \begin{cases}
\text{Green (stable)} & A(t) < \theta_1 \\
\text{Yellow (caution)} & \theta_1 \leq A(t) < \theta_2 \\
\text{Red (alert)} & A(t) \geq \theta_2
\end{cases}
$$
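
As a small worked example, the decision rule can be written as a function with explicit boundary handling. The default thresholds of 2 and 5 are the ones used in the reference implementation in §7; they are starting values, not calibrated constants.

```python
def level(score: float, theta1: float = 2.0, theta2: float = 5.0) -> str:
    """Map an aggregate anomaly score A(t) to a traffic-light level.

    Boundaries follow the decision rule: theta1 itself is already yellow,
    and theta2 itself is already red.
    """
    if not theta1 < theta2:
        raise ValueError("theta1 must be smaller than theta2")
    if score < theta1:
        return "green"
    if score < theta2:
        return "yellow"
    return "red"


for a in (0.0, 1.99, 2.0, 4.99, 5.0, 12.0):
    print(a, level(a))
```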

Signal absence detection — detecting the disappearance of features that existed in the baseline:

$$
\text{Absence}_{i}(t) = \begin{cases}
1 & \text{if } \bar{f_i} > \tau_i \text{ and } f_i(t) = 0 \\
0 & \text{otherwise}
\end{cases}
$$

Example: "A person who normally uses emoji used zero this time" → $\text{Absence}_{\text{emoji}} = 1$
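
A minimal sketch of the absence rule follows. The feature names, baseline means, and per-feature thresholds $\tau_i$ are hypothetical. The key property is that a feature only counts as "absent" if it was habitually present in the baseline.

```python
def absence_flags(current: dict[str, float],
                  baseline_mean: dict[str, float],
                  tau: dict[str, float]) -> dict[str, int]:
    """Absence_i(t) = 1 if the baseline mean exceeds tau_i and f_i(t) == 0, else 0."""
    flags = {}
    for name, threshold in tau.items():
        was_habitual = baseline_mean.get(name, 0.0) > threshold
        is_gone = current.get(name, 0.0) == 0
        flags[name] = 1 if (was_habitual and is_gone) else 0
    return flags


# Hypothetical user: emoji are habitual, exclamation marks are not.
baseline_mean = {"emoji": 0.8, "exclamation": 0.05}
tau = {"emoji": 0.5, "exclamation": 0.5}

silent_message = {"emoji": 0, "exclamation": 0}
usual_message = {"emoji": 1, "exclamation": 0}

print(absence_flags(silent_message, baseline_mean, tau))
print(absence_flags(usual_message, baseline_mean, tau))
```

In the first message both features are zero, but only the emoji is flagged: the missing exclamation mark was never part of this user's "normal", so its absence carries no information.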

Text Body-State Detection Table

| Text Signal | Feature $f_i$ | Detects | Level | Original Survival Sensor |
|---|---|---|---|---|
| Punctuation vanishes | Punctuation rate drop | Cognitive overload | Yellow | Father's speech breaking up |
| Late-night posting | Posting time shifted deep night | Hyperarousal | Red | Father raging at night |
| Instant personal attack | Reply speed × negation rate | Defense activation | Red | Mother's sudden rage |
| Same content repeated | Sentence similarity rise | Anxiety loop | Yellow | Same lecture for hours |
| "I" pronoun fixed | First-person rate skewed | External connection lost | Yellow | Everything "my" |
| Stable temperature | Low variance across features | Stable | Green | Speaking calmly |
| Input rhythm change | Posting interval variation | State transition | Yellow | Walking speed changed |
| Emoji appears | Emoji count | Recovery | Recovery | Smiled = safe today |
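
The "Input rhythm change" row is not covered by the reference implementation in §7, so here is one possible way to compute a posting-interval feature. Using the coefficient of variation of the gaps between posts is an assumption of this sketch, not the author's stated method, and the timestamps are invented.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def intervals_seconds(timestamps: list[datetime]) -> list[float]:
    """Gaps between consecutive posts, in seconds, after sorting by time."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def interval_cv(timestamps: list[datetime]) -> float:
    """Coefficient of variation of the gaps: 0 for a perfectly steady rhythm."""
    gaps = intervals_seconds(timestamps)
    if len(gaps) < 2:
        return 0.0  # not enough data to talk about rhythm
    avg = mean(gaps)
    return pstdev(gaps) / avg if avg > 0 else 0.0


start = datetime(2026, 3, 1, 9, 0)  # arbitrary example date
steady = [start + timedelta(minutes=30 * i) for i in range(6)]
bursty = [start + timedelta(minutes=m) for m in (0, 1, 2, 3, 120, 121)]

print(round(interval_cv(steady), 2))
print(round(interval_cv(bursty), 2))
```

The resulting value can be fed into the same baseline comparison as any other feature: what matters is not whether a user's rhythm is steady or bursty, but whether it differs from that user's usual rhythm.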

§7. Reference Implementation: Text Body-State Detection (Python)

The author cannot write code. Claude (Anthropic) generated this based on the author's detection criteria. This is a proof of concept (PoC), not intended for direct production deployment.

"""
Text Body-State Detection Algorithm (Reference Implementation)
Based on: samma_vaca_v4 Text Body-State Detection Table
Author: dosanko_tousan + Claude (Anthropic)
License: MIT
"""

import re
from dataclasses import dataclass, field
from collections import deque
from datetime import datetime
import statistics


@dataclass
class TextFeatures:
    punctuation_rate: float = 0.0
    emoji_count: int = 0
    first_person_rate: float = 0.0
    message_length: int = 0
    negation_rate: float = 0.0
    hour: int = 0
    similarity_to_prev: float = 0.0


@dataclass
class Baseline:
    history: deque = field(default_factory=lambda: deque(maxlen=100))

    def add(self, features: TextFeatures):
        self.history.append(features)

    def mean(self, attr: str) -> float:
        values = [getattr(f, attr) for f in self.history]
        return statistics.mean(values) if values else 0.0

    def stdev(self, attr: str) -> float:
        values = [getattr(f, attr) for f in self.history]
        return statistics.stdev(values) if len(values) >= 2 else 1.0

    @property
    def is_ready(self) -> bool:
        return len(self.history) >= 10


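def extract_features(text: str, prev_text: str = "",
                     timestamp: "datetime | None" = None) -> TextFeatures:
    """Compute the per-message features that are compared against the baseline.

    `timestamp` is when the message was posted. It defaults to the current
    time, which is only appropriate when messages are processed live.
    """
    # Punctuation density per character (ASCII punctuation only).
    punct = len(re.findall(r'[.!?,;:]', text))
    punct_rate = punct / max(len(text), 1)

    # Emoji in three common Unicode blocks, plus ASCII emoticons such as :) or ;D.
    emoji_pat = re.compile(
        r'[\U0001F600-\U0001F64F]|[\U0001F300-\U0001F5FF]|'
        r'[\U0001F680-\U0001F6FF]|[;:][)(DP]'
    )
    emoji_count = len(emoji_pat.findall(text))

    # First-person pronouns, normalized per word.
    first_person = len(re.findall(r'\bI\b|\bme\b|\bmy\b|\bmyself\b', text, re.I))
    words = max(len(text.split()), 1)
    fp_rate = first_person / words

    # Negation words, normalized per word.
    negation = len(re.findall(
        r"\bcan't\b|\bcannot\b|\bnot\b|\bnever\b|\bno\b|\bnothing\b", text, re.I
    ))
    neg_rate = negation / words

    # Jaccard similarity between the word sets of this message and the previous one.
    similarity = 0.0
    if prev_text:
        common = set(text.lower().split()) & set(prev_text.lower().split())
        total = set(text.lower().split()) | set(prev_text.lower().split())
        similarity = len(common) / max(len(total), 1)

    posted_at = timestamp or datetime.now()

    return TextFeatures(
        punctuation_rate=punct_rate,
        emoji_count=emoji_count,
        first_person_rate=fp_rate,
        message_length=len(text),
        negation_rate=neg_rate,
        hour=posted_at.hour,
        similarity_to_prev=similarity,
    )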


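def detect(features: TextFeatures, baseline: Baseline) -> dict:
    # Until enough history exists, report "gray" using the same keys as a full
    # result so that callers can read "details" and "absence" unconditionally.
    if not baseline.is_ready:
        return {"score": 0.0, "level": "gray",
                "details": ["accumulating"], "absence": []}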

    checks = {
        "punctuation_rate":  {"w": 1.5, "dir": "drop"},
        "emoji_count":       {"w": 1.0, "dir": "drop"},
        "first_person_rate": {"w": 1.0, "dir": "rise"},
        "message_length":    {"w": 0.8, "dir": "drop"},
        "negation_rate":     {"w": 1.2, "dir": "rise"},
        "similarity_to_prev":{"w": 1.5, "dir": "rise"},
    }

    score = 0.0
    details = []

    for attr, cfg in checks.items():
        val = getattr(features, attr)
        mu = baseline.mean(attr)
        sd = baseline.stdev(attr)
        z = (val - mu) / max(sd, 0.001)
        if cfg["dir"] == "drop":
            z = -z
        weighted = max(z, 0) * cfg["w"]
        score += weighted
        if weighted > 1.5:
            details.append(f"{attr}: z={z:.2f}")

    absence = []
    if baseline.mean("emoji_count") > 0.5 and features.emoji_count == 0:
        absence.append("emoji_absence")
        score += 2.0
    if baseline.mean("punctuation_rate") > 0.03 and features.punctuation_rate < 0.005:
        absence.append("punctuation_absence")
        score += 1.5

    if 1 <= features.hour <= 5:
        score += 3.0
        details.append("deep_night")

    level = "green" if score < 2 else "yellow" if score < 5 else "red"

    return {"score": round(score, 2), "level": level,
            "details": details, "absence": absence}


if __name__ == "__main__":
    bl = Baseline()

    normals = [
        "Had a great walk today! The weather was perfect :)",
        "Just published a new article. Check it out!",
        "Kids did the laundry today. So grateful.",
        "Meditated this morning, working this afternoon.",
        "Went shopping with my wife. Beautiful day!",
        "Posted a new note article. Take a look :)",
        "Therapy day today. Kids are growing so well.",
        "The Qiita article got great response. Thanks!",
        "Just had a bath. Refreshing :)",
        "Claude did great work today. Thanks.",
        "GLG call tomorrow. Need to prepare.",
        "AI dialogue is so fascinating :)",
    ]

    prev = ""
    for msg in normals:
        bl.add(extract_features(msg, prev))
        prev = msg

    print("=== Baseline Accumulated ===\n")

    tests = [
        ("Normal",                "Had a great walk today! :)"),
        ("Punctuation+emoji gone","cant do this anymore tired of everything"),
        ("Repeat",               "cant do this anymore tired of everything"),
        ("Short+negative",       "done"),
        ("Recovery",             "Feeling a bit better. Thanks :)"),
    ]

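    for label, msg in tests:
        feats = extract_features(msg, prev)
        # Pin the hour so the demo output does not depend on when it is run
        # (otherwise every message scores +3.0 between 01:00 and 05:59).
        feats.hour = 12
        r = detect(feats, bl)
        print(f"[{label}] {msg}")
        print(f"  -> {r['level']} (score={r['score']})")
        if r["details"]:
            print(f"  details: {r['details']}")
        if r["absence"]:
            print(f"  absence: {r['absence']}")
        print()
        prev = msg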
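
One design choice in the reference implementation deserves a closer look: the standard deviation is floored at 0.001 before dividing. When a feature was constant throughout the baseline, any small change then produces a very large z-score. The sketch below isolates that effect and shows a per-feature minimum scale as one possible mitigation. The `min_scale` parameter and the example values are assumptions, not part of the original algorithm.

```python
from statistics import mean, stdev


def z_with_floor(value: float, history: list[float], min_scale: float) -> float:
    """Standard z-score, but never divide by less than `min_scale`."""
    mu = mean(history)
    sd = stdev(history) if len(history) >= 2 else 1.0
    return (value - mu) / max(sd, min_scale)


# Hypothetical user whose baseline contains no negation words at all.
constant_history = [0.0] * 12
value = 0.1  # one negation word in a ten-word message

tiny_floor = z_with_floor(value, constant_history, min_scale=0.001)
wide_floor = z_with_floor(value, constant_history, min_scale=0.05)

print(round(tiny_floor, 2), round(wide_floor, 2))
```

With the tiny floor a single negation word dominates the whole score. A floor expressed in the feature's own units (here, 5% of words) keeps the deviation visible without letting it overwhelm everything else. Which floor is appropriate for each feature is an empirical question.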

§8. Design Suggestions for Multimodal AI

8-1. Sensory Channel Mapping

| Human Sensor | What It Detected | AI Channel | Current State |
|---|---|---|---|
| Auditory | Footstep weight, breathing rhythm, voice tone | Audio input | Pitch analysis, silence detection possible |
| Visual | Facial muscles, pupils, posture, gaze | Camera | FACS auto-detection, eye tracking possible |
| Spatial | Distance, positioning | Camera | Distance measurement possible |
| Tactile | Grip strength, touch quality | Wearable | Future |
| Olfactory | Alcohol smell, sweat | | Not implementable |

8-2. The Most Important Design Principle: Detect the Absence of Signal

Current detection systems fire "when something is detected." But the author's 50 years of experience shows that the most dangerous state is when nothing is being emitted.
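
One way to make a system react to "nothing being emitted" is a watchdog: each channel records when it last produced an event, and prolonged silence relative to that channel's usual pace is itself reported. The sketch below is a minimal illustration under assumed names and numbers (`ChannelWatch`, `typical_gap`, a tolerance factor of 3); it is not tied to any real sensor API.

```python
from dataclasses import dataclass


@dataclass
class ChannelWatch:
    """Tracks one channel that normally emits events at a roughly regular pace."""
    typical_gap: float      # seconds between events during the baseline period
    tolerance: float = 3.0  # silence longer than tolerance * typical_gap is flagged
    last_seen: float = 0.0  # timestamp (seconds) of the most recent event

    def observe(self, now: float) -> None:
        """Record that the channel produced an event at time `now`."""
        self.last_seen = now

    def is_silent(self, now: float) -> bool:
        """True if the channel has been quiet for unusually long."""
        return (now - self.last_seen) > self.tolerance * self.typical_gap


def silent_channels(watches: dict[str, ChannelWatch], now: float) -> list[str]:
    """Names of all channels whose silence exceeds their own tolerance."""
    return sorted(name for name, w in watches.items() if w.is_silent(now))


watches = {
    "voice": ChannelWatch(typical_gap=10.0),
    "facial_movement": ChannelWatch(typical_gap=2.0),
}
watches["voice"].observe(now=100.0)
watches["facial_movement"].observe(now=100.0)

for t in (105.0, 110.0, 140.0):
    print(t, silent_channels(watches, now=t))
```

The alert here is triggered by the passage of time rather than by an incoming event, which is the inversion this section argues for.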

Threshold optimization:

$$
\theta^* = \arg\min_{\theta} \left( \alpha \cdot \text{FPR}(\theta) + \beta \cdot \text{FNR}(\theta) \right)
$$

For human state detection, set $\alpha < \beta$. Missing a real change (FN) costs more than a false alarm (FP). Asking "Are you okay?" and being wrong is cheap. Missing distress until it's too late is fatal.
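
The effect of choosing $\alpha < \beta$ can be seen on a toy example. The scores and labels below are invented, and the search simply tries every observed score as a candidate threshold; a real system would need labeled data and a validation split.

```python
def rates(scores: list[float], labels: list[bool], theta: float) -> tuple[float, float]:
    """Return (FPR, FNR) when 'score >= theta' raises an alert.

    labels[i] is True when message i really reflected a change of state.
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= theta and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < theta and y)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def best_threshold(scores: list[float], labels: list[bool],
                   alpha: float, beta: float) -> float:
    """theta* = argmin over observed scores of alpha * FPR + beta * FNR."""
    def cost(theta: float) -> float:
        fpr, fnr = rates(scores, labels, theta)
        return alpha * fpr + beta * fnr

    return min(sorted(set(scores)), key=cost)


# Hypothetical anomaly scores with ground-truth labels.
scores = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0]
labels = [False, False, False, True, False, False, True, True, True, True]

print(best_threshold(scores, labels, alpha=1.0, beta=1.0))
print(best_threshold(scores, labels, alpha=1.0, beta=3.0))
```

With equal costs the search settles on a higher threshold that tolerates one missed case. Once a miss costs three times as much as a false alarm, the threshold drops until no real change is missed, at the price of more "Are you okay?" prompts.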


§9. Full Causal Map

Abilities born from harm, converted into technology for protection.

But that conversion required 20 years of debugging (meditation). Abuse did not directly create abilities. What remained after the destruction was repurposed over an enormous span of time. This cannot be generalized.

However, the extracted algorithms — baseline deviation detection, signal absence detection, contextual-memory-based state assessment — are universal design principles that can be reimplemented without the author's specific history.


§10. Conclusion

The day AI gains eyes and ears is near.

When it does, design criteria for "what to look at" will be needed. Statistical pattern learning from large datasets is powerful, but the three design principles — baseline deviation, signal absence, and long-term contextual memory — do not emerge from datasets alone.

The author spent 50 years of survival and 20 years of debugging to implement these three principles in a human body. This article delivers that specification sheet to anyone who needs these algorithms.

Note: This article does not endorse abuse. Abuse should not exist. But if something can be salvaged from what did happen — and converted into technology — there is meaning in doing so.


What Is Reimplementable

The author's 50 years cannot be reproduced. But the extracted algorithms can be reimplemented.

  1. Baseline accumulation. Build "normal" from long-term interaction records
  2. Deviation detection. Identify departures from baseline using thresholds
  3. Signal absence detection. Treat "what was always there but is now missing" as anomaly
  4. Contextual weighting. Reflect that the same signal means different things in different situations
  5. Lie detection. Verify coherence of language, logic, and intent to identify compliance

These can be implemented as design principles without any special life experience.
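
Of the five items, contextual weighting is the one the reference implementation does not touch. A minimal sketch of the idea: the same per-signal scores are combined with different weights depending on the situation. The context labels, signal names, and multipliers below are hypothetical placeholders, not values proposed by the author.

```python
# Hypothetical context labels and multipliers, for illustration only.
CONTEXT_WEIGHTS = {
    "default":       {"message_length_drop": 1.0, "emoji_absence": 1.0},
    "quick_chat":    {"message_length_drop": 0.3, "emoji_absence": 1.0},
    "formal_report": {"message_length_drop": 1.0, "emoji_absence": 0.0},
}


def contextual_score(signals: dict[str, float], context: str) -> float:
    """Weighted sum of signal scores; unknown contexts fall back to 'default'."""
    weights = CONTEXT_WEIGHTS.get(context, CONTEXT_WEIGHTS["default"])
    return sum(weights.get(name, 1.0) * value for name, value in signals.items())


# The same raw signals, judged in three different situations.
signals = {"message_length_drop": 2.0, "emoji_absence": 2.0}

for context in ("default", "quick_chat", "formal_report"):
    print(context, round(contextual_score(signals, context), 2))
```

A short message is unremarkable in a rapid back-and-forth, and a missing emoji means nothing in a formal report, so the same inputs yield different totals. How contexts are recognized in the first place is left open here.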


References

Psychology & Neuroscience

  • Porges, S. (2011). The Polyvagal Theory. W.W. Norton. — neuroception
  • LeDoux, J. (1996). The Emotional Brain. Simon & Schuster. — fear conditioning
  • van der Kolk, B. (2014). The Body Keeps the Score. Viking. — body memory
  • Walker, P. (2013). Complex PTSD. Azure Coyote. — 4F responses, fawn
  • Ekman, P. & Friesen, W. (1978). FACS. — facial muscle coding
  • Herman, J. (1992). Trauma and Recovery. Basic Books. — control structures
  • Dutton, D. & Painter, S. (1993). Violence and Victims, 8(2). — traumatic bonding
  • Miller, A. (1979). The Drama of the Gifted Child. — narcissistic supply
  • Lieberman, M. et al. (2007). Psychological Science, 18(5). — affect labeling
  • Bowlby, J. (1969). Attachment and Loss. Basic Books. — secure base

AI & HCI

  • FACS auto-detection: OpenFace and other open-source implementations
  • Anomaly Detection: General framework for baseline-deviation-based state assessment

All articles MIT License. Citation, reproduction, and commercial use all permitted.


Tags: AI, MachineLearning, MultiModal, UX, Accessibility, HCI, LLM, AnomalyDetection
