Setting up fish-speech


Posted at 2025-06-05

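This post summarizes the steps for setting up fish-speech on Ubuntu.
It covers the working directory layout, the virtual environment, the model download, both the WebUI and the CLI, and a final check that everything runs.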

⸻

🐟 fish-speech setup and CLI speech synthesis: complete guide (Ubuntu)

⸻

✅ 0. Create and enter the working directory

mkdir -p ~/workspace/fish-speech-project
cd ~/workspace/fish-speech-project

⸻

✅ 1. Create and activate a Python virtual environment

python3 -m venv venv
source venv/bin/activate
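
Every later pip command assumes the virtual environment is active in the current shell. As a minimal sketch, the helper below (venv_status is a hypothetical name, not part of fish-speech) relies on the fact that the activate script exports VIRTUAL_ENV:

```shell
# Hypothetical helper: report whether a virtual environment is active.
# The venv "activate" script exports VIRTUAL_ENV, so an unset or empty
# value is taken to mean that no environment is active.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
        return 1
    fi
}

# Print a reminder instead of failing when no environment is active.
venv_status || echo "run: source venv/bin/activate"
```

Running this in a fresh terminal is a quick way to notice that the environment has to be activated again before continuing.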

⸻

✅ 2. Install system packages (first time only)

sudo apt update
sudo apt install -y libsox-dev ffmpeg build-essential cmake \
    libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0
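
To confirm that the packages landed, one option is to look for the commands they provide. The sketch below assumes that the ffmpeg package supplies ffmpeg and ffplay and that build-essential supplies gcc and make; missing_cmds is a hypothetical helper name:

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH (prints nothing when all are present).
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Commands assumed to come from the packages installed above.
missing=$(missing_cmds ffmpeg ffplay cmake gcc make)
if [ -n "$missing" ]; then
    echo "not found on PATH:"
    echo "$missing"
else
    echo "all expected commands found"
fi
```

The development libraries (libsox-dev, portaudio19-dev and so on) ship no commands, so this check covers only the tools, not the headers.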

⸻

✅ 3. Create requirements.txt and install dependencies

Create requirements.txt in the project root:

torch==2.4.1
torchvision==0.19.1
torchaudio==2.4.1
huggingface_hub

annotated-types==0.7.0
certifi==2025.1.31
cffi==1.17.1
comtypes==1.4.6
cryptography==44.0.1
numpy
openai
opuslib==3.0.1
paho-mqtt==2.1.0
psutil==7.0.0
PyAudio==0.2.14
pycaw==20240210
pynput
pyperclip==1.9.0
pypinyin==0.53.0
requests==2.32.3
vosk==0.3.45
webrtcvad-wheels==2.0.14
websockets==11.0.3
colorlog==6.9.0
soundfile>=0.12.1
pygame==2.6.1
scipy

Run the install:

pip install -r requirements.txt
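
The list mixes pinned entries (torch==2.4.1) with unpinned ones (numpy, scipy), so two installs on different days can resolve to different versions. As a rough sketch, the hypothetical helper unpinned_reqs lists the entries that carry no version operator; it simply treats any line containing =, <, >, ~ or ! as constrained:

```shell
# Hypothetical helper: list requirement lines that have no version
# constraint. Blank lines and comment lines are skipped first.
unpinned_reqs() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" \
        | grep -v '[=<>~!]' || true
}

# Only inspect the file when it exists in the current directory.
if [ -f requirements.txt ]; then
    unpinned_reqs requirements.txt
fi
```

Whether to pin those entries is a judgment call; running pip freeze after a working install records the versions that were actually resolved.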

⸻

✅ 4. Clone and install fish-speech itself

git clone https://github.com/fishaudio/fish-speech.git
cd fish-speech
pip install -e .
cd ..

Resulting directory layout:
~/workspace/fish-speech-project/fish-speech

⸻

✅ 5. Download the model (Hugging Face)

pip install huggingface_hub # only if not already installed
huggingface-cli login # first time only: enter your access token
huggingface-cli download fishaudio/fish-speech-1.5 --local-dir fish-speech/checkpoints/fish-speech-1.5
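
An interrupted download can leave the checkpoint directory incomplete, which only shows up later as a load error. A minimal sketch of a pre-flight check follows; check_checkpoint is a hypothetical helper, and the only file it looks for is the decoder file named in the later steps of this guide:

```shell
# Hypothetical helper: verify that the checkpoint directory exists and
# that the decoder weights file inside it is present and non-empty.
check_checkpoint() {
    ckpt_dir=$1
    ckpt_decoder="$ckpt_dir/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"
    if [ ! -d "$ckpt_dir" ]; then
        echo "missing directory: $ckpt_dir"
        return 1
    fi
    if [ ! -s "$ckpt_decoder" ]; then
        echo "missing or empty: $ckpt_decoder"
        return 1
    fi
    echo "ok: $ckpt_dir"
}

# Run from the project root, where the download command was issued.
check_checkpoint fish-speech/checkpoints/fish-speech-1.5 \
    || echo "re-run the download step"
```

This checks for presence only; it does not validate the contents of the files.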

⸻

✅ 6. Run speech synthesis from the CLI (TTS)

cd fish-speech
python -m tools.run_tts_cli \
    --text "こんにちは、これは fish-speech のテストです。" \
    --llama-checkpoint-path checkpoints/fish-speech-1.5 \
    --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
    --decoder-config-name firefly_gan_vq \
    --output output.wav

⸻

✅ 7. Play the output audio (optional)

ffplay -autoexit output.wav # or: aplay output.wav
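
If playback fails, it helps to know whether the file is a real WAV or an empty or truncated result. The sketch below assumes the output is a standard RIFF/WAVE file and inspects only the 12-byte header; is_wav is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if the file is non-empty and its
# header has "RIFF" in bytes 0-3 and "WAVE" in bytes 8-11.
is_wav() {
    [ -s "$1" ] || return 1
    wav_magic=$(dd if="$1" bs=1 count=4 2>/dev/null)
    wav_form=$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)
    [ "$wav_magic" = "RIFF" ] && [ "$wav_form" = "WAVE" ]
}

# Check the synthesized file only if it exists in the current directory.
if [ -e output.wav ]; then
    if is_wav output.wav; then
        echo "output.wav has a WAV header"
    else
        echo "output.wav is empty or not a WAV file"
    fi
fi
```

A valid header does not guarantee audible audio, but it separates a synthesis problem from a playback problem.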

⸻

✅ 8. Check operation in the WebUI (the GUI also accepts audio input)

python -m tools.run_webui \
    --llama-checkpoint-path checkpoints/fish-speech-1.5 \
    --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
    --decoder-config-name firefly_gan_vq

Then open the URL that the WebUI prints on startup in a browser.

⸻

✅ Appendix (scripting example)

Save the following as run_tts.sh:

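#!/bin/bash
# Stop at the first failing command (activation, cd, or synthesis).
set -e

# Require the text to synthesize as the first argument.
if [ -z "$1" ]; then
    echo "usage: $0 \"text to synthesize\"" >&2
    exit 1
fi
TEXT="$1"

source ~/workspace/fish-speech-project/venv/bin/activate
cd ~/workspace/fish-speech-project/fish-speech

python -m tools.run_tts_cli \
    --text "$TEXT" \
    --llama-checkpoint-path checkpoints/fish-speech-1.5 \
    --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
    --decoder-config-name firefly_gan_vq \
    --output output.wav

# Play the result and exit when playback finishes.
ffplay -autoexit output.wav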

How to run it:

chmod +x run_tts.sh
./run_tts.sh "おはようございます、今日は元気ですか？"
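
The script always writes to output.wav, so each run overwrites the previous result. One possible variation, sketched here with a hypothetical helper timestamped_wav, is to derive a fresh file name per run; the prefix and the timestamp format are illustrative choices:

```shell
# Hypothetical helper: build a name such as tts_20250605_101500.wav.
# The optional first argument replaces the default "tts" prefix.
timestamped_wav() {
    printf '%s_%s.wav\n' "${1:-tts}" "$(date +%Y%m%d_%H%M%S)"
}

# Example: choose the output file name for one run.
OUT=$(timestamped_wav)
echo "would write to: $OUT"
```

In run_tts.sh, the value of OUT would then be used in place of output.wav for both the --output argument and the ffplay call.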

⸻

With these steps in place, you can synthesize Japanese speech with fish-speech from either the CLI or the GUI.
