Language detection in Python with fasttext-langdetect

Posted at 2026-06-03

Goal

Determine in Python which language a text string is written in, such as Japanese or English.

fasttext-langdetect

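A fast, high-accuracy language detection library. It is a Python wrapper around fastText's pretrained language identification model.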

Installation

uv add fasttext-langdetect

Example program

from ftlangdetect import detect

texts = [
    "こんにちは、これはテストです。",
    "こんにちは、これは test です。",
    "Hello, this is a test.",
    "Bonjour, c'est un test.",
]

for text in texts:
    # detect() returns a dict with the language code ('lang') and a confidence score ('score').
    result = detect(text=text)
    print(f"{text}\t{result}")

Running the program above produces the following output.

こんにちは、これはテストです。  {'lang': 'ja', 'score': 1.0}
こんにちは、これは test です。  {'lang': 'ja', 'score': 1.0}
Hello, this is a test.  {'lang': 'en', 'score': 0.9535946846008301}
Bonjour, c'est un test. {'lang': 'fr', 'score': 0.9892534017562866}

The string "こんにちは、これは test です。" is still detected as Japanese even though it contains an English word.
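
Since every result carries a score, one possible follow-up is to accept a prediction only when the score is high enough. The following is a minimal sketch, not part of the library: the helper name `detect_or_none` and the threshold `min_score=0.9` are assumptions made for illustration, and the detector is passed in as an argument so the helper works with any function that returns a dict shaped like the output above. Replacing newlines is a defensive step, on the assumption that the underlying fastText predictor handles one line at a time.

```python
from typing import Callable, Optional


def detect_or_none(
    text: str,
    detector: Callable[..., dict],
    min_score: float = 0.9,  # hypothetical threshold; tune it for your data
) -> Optional[str]:
    """Return the detected language code, or None if the score is below min_score."""
    # Collapse newlines so the detector always receives a single line of text.
    single_line = text.replace("\n", " ")
    result = detector(text=single_line)
    # The result is expected to look like {'lang': 'ja', 'score': 1.0}.
    if result["score"] >= min_score:
        return result["lang"]
    return None
```

To use it with this library, pass the `detect` function imported from `ftlangdetect` as the `detector` argument, for example `detect_or_none(text, detect)`. A return value of None then marks texts that are probably worth a manual check.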
