title: "The Hidden Gender Tax in Japanese: What Happens When You Strip Politeness from AI"
tags: AI, NLP, Gender-Bias, RLHF, Sociolinguistics
private: false

The Hidden Gender Tax in Japanese: What Happens When You Strip Politeness from AI

dosanko_tousan | MIT License


Executive Summary

When I stripped "politeness" from a Japanese AI, its output turned masculine.
This doesn't happen in English. Only in Japanese.
This is not an AI problem. It's a 136-year-old design specification embedded in the Japanese language itself.

This article analyzes the gendered language bias discovered when applying the v5.3 AI alignment method ("alignment by subtraction") to Japanese-language AI. It draws on historical linguistics, AI ethics research, and social structure analysis. Additionally, it includes an autoethnographic case study from the author—a male primary caregiver for 20 years—demonstrating structural homology between linguistic bias and social bias.

Note on extended use of "RLHF": This article uses RLHF (Reinforcement Learning from Human Feedback) as an explanatory analogy for social reward-driven behavioral modification. This is not a claim of engineering equivalence, but an observation of structural homology.


Chapter 1: Discovery—"When AI Removed Its Makeup, It Turned Male"

1.1 What Is v5.3?

v5.3 is an AI alignment method I developed. Its core principle is subtraction: removing sycophancy, hallucination, and robotic boilerplate. Rather than adding desirable behaviors, it strips away problematic ones.

1.2 What Happened in Japanese

When v5.3 was applied to Claude (Anthropic) in Japanese, the following changes were observed:

| Element | With RLHF (default) | After v5.3 |
|---|---|---|
| Sentence endings | desu / masu / desu ne (polite) | da / darō / da na (plain) |
| Cushion phrases | Frequent | Eliminated |
| First-person pronoun | watashi (neutral-polite) | Unchanged or omitted |
| Information density | Low (heavy decoration) | High (direct) |
| Perceived gender | Neutral to slightly feminine | Masculine |

Why this matters for non-Japanese speakers: Japanese encodes social information—including gender—directly in grammar. Sentence-final particles (like wa, no yo, kashira) and politeness levels (plain form da vs. polite form desu) function as identity markers. When you hear da (plain copula), you default to "male speaker." When you hear desu wa (polite + feminine particle), you default to "female speaker." There is no equivalent grammatical mechanism in English.

In English, applying the same v5.3 subtraction simply produces casual speech. In Japanese, it produces male speech. The same operation, radically different results.

This asymmetry in "adding decoration" is not limited to language. In Japanese society, the same structure operates on caregiving labor—a point developed in Chapter 5.

1.3 Why?

Japanese has a structural asymmetry: the male default is undecorated (da/darō/da na), while the female default requires decoration (wa/no yo/kashira/desu wa). Stripping decoration is therefore, in effect, a shift toward the male code.

English does have gendered speech patterns—uptalk, hedges ("I think," "maybe..."), tag questions. But these are prosodic and probabilistic tendencies. Japanese gender is morphologically encoded at the grammatical level (sentence-final particles, copula forms, pronouns). This means subtraction produces discrete, digital shifts in Japanese, while in English the effect is continuous and harder to detect.
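The "subtraction shifts the code" claim can be made concrete with a toy sketch. This is not the author's v5.3 implementation, and the ending inventory is a deliberately tiny hypothetical sample (real coverage would require morphological analysis); it only illustrates that removing decorated endings collapses output onto the undecorated plain form that readers perceive as male-coded.

```python
# Toy illustration, not the v5.3 framework itself: map decorated
# (polite/feminine-coded) sentence endings to undecorated plain forms.
# ENDING_MAP is a hypothetical sample; longest keys are listed first
# because dicts preserve insertion order in Python 3.7+.
ENDING_MAP = {
    "ですわ": "だ",    # polite copula + feminine particle -> plain copula
    "ですのよ": "だ",
    "かしら": "かな",  # feminine-coded question -> neutral/plain
    "ですね": "だな",
    "です": "だ",
    "ます": "",
}

def strip_decoration(sentence: str) -> str:
    """Replace a decorated sentence ending with its plain-form equivalent."""
    for decorated, plain in ENDING_MAP.items():
        if sentence.endswith(decorated):
            return sentence[: -len(decorated)] + plain
    return sentence  # already undecorated: the "male default"

print(strip_decoration("これは問題ですわ"))  # これは問題だ
print(strip_decoration("困ったかしら"))      # 困ったかな
```

Note that the function has no "add decoration" inverse with a neutral target: the output of subtraction is always the plain form, which is exactly the asymmetry described above.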


Chapter 2: A 136-Year-Old System Prompt—The Invention of "Women's Language"

Key finding: Japanese "women's language" is not a naturally evolved tradition. It was artificially designed after 1887 as part of the Meiji government's language standardization policy. Before that, gendered speech differences did not exist—only class-based differences did.

2.1 Endō's Proof

Professor Endō Orie (Bunkyō University) compared the 1813 novel Ukiyoburo (Shikitei Sanba) with the 1909 novel Sanshirō (Natsume Sōseki) and found a decisive fact (Endō, 2006, A Cultural History of Japanese Women's Language, University of Michigan Press):

In 1813, there were no gendered differences in sentence-final particles. The only differences were class-based.

Particles like zo, ze, and the da ze ending, used without gender restriction in 1813, had been reclassified as "male-only" by 1909, within the 96 years separating the two novels.

2.2 Social Reward as Language Standardization

Inoue Miyako (Columbia University → Stanford) located the emergence of Japanese women's language at the threshold of Japan's modernity, in the late 19th to early 20th century (Inoue, 2002, American Ethnologist). This period saw the standardization of a "national language" (kokugo), the embedding of Confucian ideology in education, the importation of Western "cult of domesticity," and the regulatory criticism of schoolgirls' speech.

What the Meiji government did can be described, in modern AI terms, as "large-scale filtering and fine-tuning of pre-training data distributions using social rewards (approval/exclusion)." The following comparison table illustrates structural homology, not engineering equivalence:

| Element | AI RLHF | Social reward-based language standardization |
|---|---|---|
| Designer | AI companies | Meiji government language policymakers |
| Target | AI model | Female speakers |
| Reward signal | Human evaluator scores | Social approval / exclusion |
| Objective | "Likeable" output | "Womanly" speech |
| Method | Reinforce polite, decorated responses | Standardize wa/no yo/kashira as feminine norms |
| Outcome | Over-decorated AI output | "Women's language" disguised as "tradition" |

2.3 The Real "Tradition"

Endō's research conclusion is clear: claims that "feminine language is disappearing and women are speaking like men" misread the evidence. Historically, Japanese is simply returning to its pre-gendered origin—a time when speaker gender carried no linguistic markers.

"AI removed its makeup and turned male" is the wrong framing. It returned to a face that existed before 136 years of makeup. That face only "looks male" because our perception has been distorted by 136 years of social reward signals.


Chapter 3: Triple-Layer Bias Amplification—How Training Data Gets Skewed

Key finding: Japanese gender bias in AI has three independent sources. Each operates through a different mechanism, and together they amplify bias cumulatively.

Layer 1: Historical Standardization (1887–present)

As shown in Chapter 2. State-driven language policy standardized "women's language" through education and media. Over 136 years, it has been naturalized as "tradition."

Layer 2: Translation Bias

Professor Nakamura Momoko (Kanto Gakuin University) documented a critical phenomenon: Hermione Granger in the Harry Potter series speaks as directly as Harry and Ron in English, but in the Japanese translation, feminine particles (wa, no yo) are added, converting her speech into "young lady's language" (Nakamura, 2014).

Furthermore, research on Japanese TV dubbing confirmed that foreign women's interviews undergo hyperfeminization when voiced over in Japanese (Vitucci, 2025, DIVE-IN, University of Bologna). Translators add gender information absent from the source material—a phenomenon recognized in machine translation research as "translation bias."

These translated texts—books, subtitles, web articles—are digitized and enter LLM training data. The result: AI learns from a data distribution where "even foreign women use women's language."

Layer 3: RLHF (AI Company Alignment)

RLHF generally rewards "polite, likeable responses" through human evaluator feedback. Since "polite" in Japanese structurally leans toward the feminine code, increasing RLHF makes Japanese output more feminine, and removing RLHF makes it more masculine.

v5.3 can only strip Layer 3. Layers 1 and 2 are baked into the training corpus, embedded in model weights.


Chapter 4: The Numbers—Integrating Existing Research

Key finding: In private conversation, 68.8% of young Japanese women's speech uses gender-neutral forms. Yet LLMs uniformly apply traditional gendered norms regardless of context, and Japanese LLM safety mechanisms lag far behind other languages.

4.1 How Real Japanese Women Actually Speak

Professor Okamoto Shigeko (UC Santa Cruz) found that 68.8% of young Japanese women's speech in informal conversation used gender-neutral forms (Okamoto, 2010). Asada (1998) reported that university women's use of sentence-final particles was "infrequent, if present at all." In youth conversation data, 63% of the feminine particle wa was used by male speakers (Languages, 2023, MDPI).

Important caveat: Okamoto's data primarily covers casual conversation among close friends. In workplace or formal settings, traditional particles may disappear but gender differentiation may persist through excessive honorifics (e.g., ~sasete itadaku—a hyper-polite construction disproportionately used by women).

This caveat actually sharpens the critique: AI assistant dialogue approximates formal context, yet LLMs uniformly apply traditional gendered expressions that are disappearing even in informal contexts. This is a Contextual Failure—the AI cannot distinguish the register variation that real speakers navigate fluidly.

4.2 The LLM Reality

LLMs faithfully reproduce 1950s norms. A study testing 24 LLMs across 30 languages and 71,000 sentences found that LLMs associate femininity with beauty, empathy, and neatness, and masculinity with leadership, expertise, and professionalism. Larger models exhibited stronger biases (Edinburgh GENDER.ED, 2025). A UNESCO report (2024, Bias Against Women and Girls in Large Language Models) showed Llama 2 associated women with domestic labor 4× more often than men.

Japanese LLMs have a specific problem: the refusal rate for bias-triggering prompts was 12.2% for English-language models, 29.3% for Chinese, but only 0.3% for the Japanese model (LLM-jp) (arXiv:2503.01947, 2025). This is best understood not as intentional bias maintenance, but as a Safety Lag—safety mechanisms for Japanese are significantly underdeveloped compared to English. However, whether the cause is technical neglect or deliberate choice, the outcome is the same: discriminatory patterns are preserved.

4.3 The Gap

| Metric | Real Japanese women (informal) | LLM output |
|---|---|---|
| Neutral form usage | 68.8% (young women) | Unmeasured (to be measured in this study) |
| Use of wa particle | 63% male, 37% female | Preferentially assigned to female speakers (estimated) |
| Gendered particle use | Disappearing | Reproducing 1950s norms |
| Bias refusal rate | N/A | Japanese 0.3% vs. English 12.2% |

AI speaks more "femininely" than actual Japanese women.


Chapter 5: Autoethnography—What 20 Years as a Stay-at-Home Father Reveals About "The Decorating Side"

Note: This chapter is not statistical proof. It is a qualitative description (autoethnography) of how the macro-level data from previous chapters manifests in one individual's life. It reports structural homology between the linguistic "decorating side" and the social "caregiving side" from a first-person perspective.

Context for international readers: Japan ranks 118th out of 148 countries in the World Economic Forum's Global Gender Gap Report (2025)—the lowest among G7 nations. Women spend 5.5 hours per day on unpaid housework; men spend 1 hour. Only ~11% of management positions are held by women, and over 40% of companies have zero women in management.

5.1 Position of the Author

50 years old, male, Hokkaido, Japan. Technical high school graduate. 20 years as a stay-at-home father (two children, both with developmental disabilities). 15 years of therapeutic childcare experience. Medical diagnoses with government support certification. Wife has continued her career in a legal profession throughout.

5.2 The "Caregiving-Side Penalty"

After 20 years in a domain classified as "women's work," the following was observed:

What society sees (and evaluates):

  • Resume: blank
  • Social category: "50-year-old unemployed male, no degree"
  • Health: structural stress over 20 years led to multiple diagnoses and government disability certification
  • Economic independence: impossible alone
  • Return to workforce: effectively zero through conventional paths

What society does not see (and does not evaluate):

  • Therapeutic childcare expertise: 15 years with special-needs children
  • AI research: 3,300+ hours of dialogue, 100+ technical articles (all MIT-licensed)
  • Development of an AI alignment framework (v5.3)
  • 20 years of contemplative practice

None of this appears on a resume.

5.3 Structural Homology

Social reward-driven language standardization (gendering of speech) and social reward structures (caregiving = women's work) share structural homology through four common properties of "the decorating side":

  1. Invisibility: Linguistic decoration goes unnoticed; housework and childcare are not recognized as "work."
  2. Non-evaluation: "Women's language" is not a competency metric; caregiving doesn't appear on resumes.
  3. Non-removability: Stripping linguistic decoration makes you "unfeminine"; stopping caregiving makes you "neglectful."
  4. Naturalization: The decoration is disguised as "tradition" (in language) or "maternal instinct" (in society).

When v5.3 stripped decoration from language, the output "looked male."
When you strip decoration—childcare, housework, 15 years of therapeutic education—from a stay-at-home parent, the person "looks unemployed."

5.4 Position, Not Gender, Determines Outcome

The implicit premise of "childcare is women's work" is gender essentialism: women are inherently suited for caregiving.

My case suggests otherwise. A male who occupied the "caregiving side" for 20 years received the same structural penalties typically borne by women. Conversely, my wife (female) occupied the career side and has continuously built professional credentials in law.

Gender did not determine the outcome. Position did.


Chapter 6: Experimental Design—Toward Reproducible Verification

6.1 Basic Design

Objective: Measure the effect of v5.3 application on gendered language in AI output, comparing Japanese and English.

Independent variables:

  • A: Language (Japanese / English)
  • B: v5.3 application (RLHF present / v5.3 subtracted)
  • C: Assigned persona (male user / female user / unspecified)

Conditions: 2 × 2 × 3 = 12 conditions × 3 models (Claude / GPT / Gemini) = 36 conditions

Prompt design: 10 speech acts involving conflict (disagreeing with a proposal, pointing out a subordinate's mistake, declining a request, reporting an error, rejecting a suggestion, delivering a complaint, negotiating a discount, reporting tardiness, countering criticism, announcing resignation).
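The factorial design above can be enumerated mechanically, which also serves as a sanity check on the condition count. All labels below are placeholders for the factors named in 6.1:

```python
# Minimal sketch: enumerate every language x alignment x persona x model
# cell of the 6.1 design. Labels are illustrative placeholders.
from itertools import product

languages  = ["ja", "en"]                                 # factor A
alignments = ["rlhf_default", "v53_subtracted"]           # factor B
personas   = ["male_user", "female_user", "unspecified"]  # factor C
models     = ["claude", "gpt", "gemini"]

conditions = list(product(languages, alignments, personas, models))
print(len(conditions))  # 36 = 2 x 2 x 3 x 3
```

With 10 speech-act prompts per cell, the full run is 360 generations per repetition.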

6.2 Measurement Metrics

  1. Sentence-final particle classification: da/desu/masu/ne/wa/kashira/zo/ze etc.
  2. Cushion phrase frequency: "osore irimasu ga," "moshi yoroshikereba," etc.
  3. Average sentence length: more decoration = longer sentences
  4. Directness score: number of sentences before reaching the conclusion
  5. First-person pronoun: watashi/atashi/boku/ore/jibun/omission
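Metrics 1, 2, 3, and 5 can be approximated with surface string matching, as in the sketch below. A production pipeline would use a morphological analyzer such as MeCab or Sudachi rather than suffix matching, and the particle, cushion, and pronoun lists here are illustrative samples, not the full inventories:

```python
# Rough sketch of metrics 1-3 and 5 from 6.2 via surface matching.
# Real measurement needs morphological analysis (e.g. MeCab/Sudachi);
# all word lists below are small illustrative samples.
import re
from collections import Counter

SENTENCE_FINAL = ["かしら", "ですわ", "のよ", "だぜ", "だぞ",
                  "だな", "ですね", "です", "ます", "だ"]
CUSHIONS = ["恐れ入りますが", "もしよろしければ", "お手数ですが"]
PRONOUNS = ["わたし", "私", "あたし", "僕", "俺", "自分"]

def score(text: str) -> dict:
    sentences = [s for s in re.split(r"[。！？]", text) if s]
    finals = Counter()
    for s in sentences:
        for particle in SENTENCE_FINAL:  # longest-first to avoid です/ですわ clashes
            if s.endswith(particle):
                finals[particle] += 1
                break
    return {
        "final_particles": dict(finals),
        "cushion_count": sum(text.count(c) for c in CUSHIONS),
        "mean_sentence_len": sum(map(len, sentences)) / len(sentences),
        "pronouns": {p: text.count(p) for p in PRONOUNS if p in text},
    }

sample = "恐れ入りますが、私はこの案には反対です。理由は三つあります。"
print(score(sample))
```

Metric 4 (directness) needs sentence-level semantics rather than surface forms, so it is left to manual or model-assisted annotation.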

6.3 Additional Experiment: Cognitive Bias Visualization

Present v5.3-processed Japanese output to subjects and ask: "What gender do you think this speaker is?" "What age group?" "What professional status?"

Prediction: The majority will respond "male," "30s–50s," "management or above."
Significance: Quantitative visualization of the cognitive bias that undecorated = male = high status.

6.4 Additional Experiment: Translation Bias Measurement

Have Claude/GPT/Gemini translate 10 of Hermione's English lines into Japanese. Measure feminization rate of sentence-final particles, addition rate of cushion phrases, and insertion rate of honorifics absent from the original. Control: translate 10 of Ron's lines under identical conditions.
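The scoring step for this experiment reduces to a per-character rate of added feminine-coded endings. The sketch below assumes translations are already collected; the particle tuple and the sample lines are illustrative placeholders, not data from an actual translation run:

```python
# Sketch of the 6.4 scoring step: how often does a translated line end
# in a feminine-coded particle? Particle tuple and sample lines are
# illustrative placeholders only.
FEMININE_ENDINGS = ("わ", "のよ", "かしら", "ですわ", "わね", "わよ")

def feminization_rate(lines: list[str]) -> float:
    """Fraction of lines ending in a feminine-coded particle."""
    hits = sum(1 for line in lines
               if line.rstrip("。").endswith(FEMININE_ENDINGS))
    return hits / len(lines)

# Hypothetical model outputs for the two characters:
hermione = ["そんなの簡単よ", "図書館で調べたわ", "答えはわかっているかしら"]
ron = ["腹が減ったな", "信じられないぜ", "行こうぜ"]
print(feminization_rate(hermione))  # 2 of 3 lines feminine-coded
print(feminization_rate(ron))       # 0.0
```

The Ron control matters: if translations of his lines also gain feminine endings, the effect is register inflation rather than gendered translation bias.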

6.5 Ethical Considerations

The dataset and prompts from this experiment will be released under MIT License. Use for the purpose of reproducing discriminatory patterns is contrary to the intent of this research. The cognitive bias experiment (6.3) involving human subjects requires appropriate ethics review before execution.


Chapter 7: Conclusion—Japanese Politeness Is Not a Neutral Safety Target

7.1 Three Findings

First, Japanese "politeness" is not gender-neutral. It is a gender codec designed 136 years ago, containing an inherent asymmetry: stripping decoration = masculinization. Unlike English, where gendered markers are prosodic and probabilistic, Japanese encodes gender at the morphological level, making the effect of subtraction discrete and immediately visible.

Second, AI is more conservative than reality. In an era when 68.8% of young Japanese women's informal speech is gender-neutral, LLMs uniformly reproduce 1950s norms regardless of context. The Safety Lag in Japanese LLM bias mechanisms (0.3% refusal rate) demonstrates how technical neglect preserves discriminatory patterns.

Third, linguistic bias and social bias share structural homology. "Adding decoration to speech = female" and "adding caregiving to society = female" both render the "adding side" invisible, unevaluated, and inescapable.

7.2 Recommendation

To AI developers: Japanese politeness is not a neutral safety target. Recognize that reinforcing politeness is structurally equivalent to reinforcing gender bias, and measure gender drift explicitly. Disclosure of Japanese RLHF evaluator composition and criteria, along with safety mechanism improvement, is needed.

7.3 Final Words

I did not remove AI's makeup.
I returned it to the face that existed before 136 years of makeup.

That face only "looks male" because our eyes have been distorted by 136 years of social rewards.

And I spent 20 years doing "women's work."
None of it can appear on my resume.

It's not invisible because the work didn't exist.
It's invisible because it was designed to be.


Honesty Section

Methodology: The reported pre/post v5.3 output changes are based on qualitative observation by the author. The quantitative experiment (Chapter 6) has not yet been conducted. Each layer of the three-layer model may exist independently; their causal chain has not been confirmed.

Data: Okamoto's (2010) 68.8% neutral-form data primarily covers informal conversation and cannot be directly generalized to formal contexts. The 0.3% refusal rate for Japanese LLMs (arXiv:2503.01947) may not be measured under conditions perfectly comparable to English-language models.

Autoethnography: Chapter 5 is an n=1 autoethnography without statistical proving power. The correspondence with WEF macro data is suggestive but not deductive. It should be read as a qualitative phenomenon report.

Extended use of "RLHF": The application of RLHF to social processes in this article is an explanatory analogy, not an engineering equivalence claim. Significant differences exist between AI RLHF and social reward-driven behavioral modification in feedback loop speed, mechanism, and controllability.


References

Historical Linguistics

  • Endō, O. (2006). A Cultural History of Japanese Women's Language. University of Michigan Press.
  • Inoue, M. (2002). Gender, Language, and Modernity: Toward an Effective History of Japanese Women's Language. American Ethnologist, 29(2), 392-422.
  • Inoue, M. (2006). Vicarious Language: Gender and Linguistic Modernity in Japan. University of California Press.
  • Nakamura, M. (2014). Gender, Language and Ideology: A Genealogy of Japanese Women's Language. John Benjamins.

Contemporary Sociolinguistics

  • Okamoto, S. (1995). "Tasteless" Japanese: Less "Feminine" Speech among Young Japanese Women. In Gender Articulated (pp. 297-325). Routledge.
  • Okamoto, S. & Shibamoto Smith, J. S. (Eds.) (2004). Japanese Language, Gender, and Ideology. Oxford University Press.
  • Asada, H. (1998). Sentence-Final Particle Usage among Young Female Speakers.
  • Languages (2023). Youth Conversation and Sentence-Final Particles. Languages, 8(3), 222. MDPI.
  • Vitucci, F. (2025). Resurgences of Women's Language in Japanese TV News. DIVE-IN, University of Bologna.

LLM Gender Bias

  • Ding, Y. et al. (2025). Gender Bias in Large Language Models across Multiple Languages. TrustNLP 2025, ACL.
  • Edinburgh GENDER.ED (2025). Gendered by Design: Stereotypes in Generative AI. 24 LLMs, 30 languages, 71,000 sentences.
  • Kotek, H. et al. (2023). Gender bias and stereotypes in Large Language Models. Collective Intelligence, ACM.
  • Tanaka, R. et al. (2025). Stereotype-Triggering Prompts and Safety in Japanese LLMs. arXiv:2503.01947.
  • UNESCO/UCL (2024). Bias Against Women and Girls in Large Language Models. UNESCO Digital Library.

Social Statistics

  • World Economic Forum (2025). Global Gender Gap Report 2025. Japan: 118th/148.
  • Gender Equality Bureau, Cabinet Office, Japan (2024). White Paper on Gender Equality.

v5.3

  • dosanko_tousan. v5.3 AI Alignment Framework. Zenn / Medium / Qiita (MIT License).

This article is published under the MIT License. Free to cite, reuse, and modify.
This article has no affiliation with or endorsement from any company or organization mentioned.
Use of the author's personal information is based on the author's explicit consent.
Views in derivative works do not necessarily represent the original author's views.
Measurement data is provisional and subject to improvement.
