Japanese speech synthesis in Firefox on Ubuntu

Last updated at 2025-12-06. Posted at 2025-11-29.

Addendum: the follow-up article describes a more full-featured way to configure this.

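Motivation

When the Web Speech API is used in Firefox on Ubuntu, each kanji in a Japanese sentence is read aloud one character at a time as "chinese letter", which is painful to listen to.
On a default Ubuntu installation, running the following speaks "chinese letter chinese letter".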

spd-say "月火"

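I looked into it, and the following procedure solved the problem.

Technologies used

Web Speech API: a JavaScript API for controlling speech from the browser. It is accessed through window.speechSynthesis.
speech-dispatcher: the Linux component that routes speech requests from browsers and other applications to a synthesizer.
OpenJTalk: the engine that actually synthesizes the Japanese speech.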

Test environment

Ubuntu: 25.10 amd64
Firefox: 145.0.2 (Snap)

Setup

Install the following package. If apt shows a list of dependency packages, install those as well.

sudo apt install speech-dispatcher-openjtalk

Create the set of configuration files for the login user.

spd-conf -ucn

Comment out the following line.

~/.config/speech-dispatcher/speechd.conf

#AudioOutputMethod "pulse"

Add the following line.

~/.config/speech-dispatcher/speechd.conf

AddModule "openjtalk" "sd_openjtalk" "openjtalk.conf"
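
The two edits to speechd.conf above can also be scripted. The following is a minimal sketch, assuming GNU sed (as shipped with Ubuntu); enable_openjtalk is a hypothetical helper name introduced here for illustration and is not part of speech-dispatcher.

```shell
#!/bin/sh
# Sketch: apply the two speechd.conf edits from this section to a given file.
# enable_openjtalk is a hypothetical helper, not a speech-dispatcher command.
enable_openjtalk() {
    conf="$1"
    # Keep a backup next to the original before changing anything.
    cp "$conf" "$conf.bak" || return 1
    # Comment out an active AudioOutputMethod "pulse" line.
    # Lines that already start with '#' do not match and are left alone.
    sed -i -E 's/^([[:space:]]*)(AudioOutputMethod[[:space:]]+"pulse")/\1#\2/' "$conf"
    # Append the AddModule line only if no active one exists yet.
    if ! grep -Eq '^[[:space:]]*AddModule[[:space:]]+"openjtalk"' "$conf"; then
        printf '%s\n' 'AddModule "openjtalk" "sd_openjtalk" "openjtalk.conf"' >> "$conf"
    fi
}

# Usage (path taken from this article):
# enable_openjtalk "$HOME/.config/speech-dispatcher/speechd.conf"
```

Running the helper twice should leave the file unchanged the second time, which makes it safe to re-run after a mistake.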

Reload the login user's service.

systemctl --user reload speech-dispatcher.service

Check the result. Running the following should now speak "つきひ" (tsukihi). Japanese was spoken correctly through the browser as well.

spd-say "月火"
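
If the phrase is still read as "chinese letter", it may help to confirm that speech-dispatcher actually loaded the module. The sketch below relies on the spd-say -O (--list-output-modules) option; has_openjtalk_module is a hypothetical helper name, and the exact layout of the listing is an assumption, so the check only looks for the word openjtalk anywhere in the output.

```shell
#!/bin/sh
# Sketch: report whether speech-dispatcher lists the openjtalk output module.
# has_openjtalk_module is a hypothetical helper, not a speech-dispatcher command.
has_openjtalk_module() {
    # Return 2 when spd-say itself is not installed.
    command -v spd-say >/dev/null 2>&1 || return 2
    # -O prints the available output modules; look for openjtalk as a whole word.
    spd-say -O 2>/dev/null | grep -qw 'openjtalk'
}

# Usage: speak through the module explicitly once it is listed.
# has_openjtalk_module && spd-say -o openjtalk "月火"
```

Passing -o openjtalk bypasses the default module selection, which separates "the module is not loaded" from "the module is loaded but not chosen".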

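Changing the voice

The default OpenJTalk voice data seems to produce garbled speech for text of 18 or more characters without punctuation, so download a different set of voice data and configure it.

Download the following file with a browser or similar.
https://sourceforge.net/projects/mmdagent/files/MMDAgent_Example/MMDAgent_Example-1.8/MMDAgent_Example-1.8.zip

Extract the zip and place the voices under the configuration directory.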

mkdir ~/.config/speech-dispatcher/hts-voice
mv ~/MMDAgent_Example-1.8/Voice/mei ~/.config/speech-dispatcher/hts-voice/
mv ~/MMDAgent_Example-1.8/Voice/takumi ~/.config/speech-dispatcher/hts-voice/
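
Before editing the module configuration, it can be useful to confirm that the voice files ended up where expected. This is a small sketch; list_htsvoices is a hypothetical helper name, and it only assumes that the voice files carry the .htsvoice extension, as in the paths used later in this article.

```shell
#!/bin/sh
# Sketch: list every .htsvoice file below a directory, sorted by path.
# list_htsvoices is a hypothetical helper introduced for illustration.
list_htsvoices() {
    find "$1" -type f -name '*.htsvoice' | sort
}

# Usage (directory created in the step above):
# list_htsvoices "$HOME/.config/speech-dispatcher/hts-voice"
```

An empty listing would suggest that the mv commands pointed at a different extraction directory.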

Copy the default openjtalk configuration into the login user's configuration directory.

cp /etc/speech-dispatcher/modules/openjtalk.conf ~/.config/speech-dispatcher/modules/

Comment out the default voice data.

~/.config/speech-dispatcher/modules/openjtalk.conf

# OpenjtalkVoice "/usr/share/hts-voice/nitech-jp-atr503-m001/nitech_jp_atr503_m001.htsvoice"

Add the voice data entries as commented-out lines, then uncomment the one voice you prefer.

~/.config/speech-dispatcher/modules/openjtalk.conf

#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/mei/mei_normal.htsvoice"
OpenjtalkVoice ".config/speech-dispatcher/hts-voice/mei/mei_happy.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/mei/mei_angry.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/mei/mei_sad.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/mei/mei_bashful.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/takumi/takumi_normal.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/takumi/takumi_happy.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/takumi/takumi_sad.htsvoice"
#OpenjtalkVoice ".config/speech-dispatcher/hts-voice/takumi/takumi_angry.htsvoice"
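
Switching voices by hand means toggling comment markers on several lines. The sketch below automates that under a few assumptions: GNU sed is available, every candidate is already listed in the file as an OpenjtalkVoice line, and voice file names contain only letters, digits, underscores and dots. select_openjtalk_voice is a hypothetical helper name, not part of speech-dispatcher.

```shell
#!/bin/sh
# Sketch: leave exactly one OpenjtalkVoice line active in openjtalk.conf.
# select_openjtalk_voice is a hypothetical helper introduced for illustration.
select_openjtalk_voice() {
    conf="$1"
    voice="$2"   # file name only, e.g. mei_normal.htsvoice
    # Refuse to edit when the requested voice is not listed in the file.
    grep -Fq "/$voice\"" "$conf" || { echo "voice not listed: $voice" >&2; return 1; }
    cp "$conf" "$conf.bak" || return 1
    # Escape dots so the file name is matched literally in the sed address.
    pat=$(printf '%s' "$voice" | sed 's/\./\\./g')
    # Step 1: comment out every currently active OpenjtalkVoice line.
    sed -i -E 's/^[[:space:]]*(OpenjtalkVoice[[:space:]])/#\1/' "$conf"
    # Step 2: uncomment only the line whose path ends with the requested file name.
    sed -i -E "/\/${pat}\"[[:space:]]*\$/ s/^#[[:space:]]*(OpenjtalkVoice[[:space:]])/\1/" "$conf"
}

# Usage (path taken from this article):
# select_openjtalk_voice "$HOME/.config/speech-dispatcher/modules/openjtalk.conf" takumi_normal.htsvoice
```

The service still has to be reloaded afterwards for the change to take effect.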

Reload the login user's service.

systemctl --user reload speech-dispatcher.service

Check the result. Running the following should speak "つきひ" (tsukihi) in a different voice.

spd-say "月火"

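Open issues

Judging from the speech-dispatcher-openjtalk source, it currently does not seem possible to choose among multiple voices, or to set volume or rate, from the Web Speech API.
Routing between multiple languages is not handled by this setup. That will probably require a much deeper understanding of the speechd.conf settings.

Closing

Thank you to everyone involved in developing and maintaining these projects.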
