Wrapping Bert-VITS2's tedious preprocessing in a shell script [Speech synthesis]

Posted at 2023-12-16

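Following litagin's explanatory article, I wrote a shell script that trains Bert-VITS2 ver 2.1 on the Tsukuyomi-chan JVS corpus. Because it targets ver 2.1 there may not be much demand for it, but running this one script is enough to get training going!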

Environment setup

The following prerequisites are assumed.

Linux or WSL2
CUDA driver installed
Python 3.9 or later installed
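
The prerequisites above can be checked from the shell before going any further. The following is only a minimal sketch and not part of the original setup: `python_at_least` and `check_prerequisites` are hypothetical helper names, and it assumes that `python3` and `nvidia-smi` are the commands that expose Python and the CUDA driver on your machine.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 (e.g. "3.10" or "3.9.18")
# is at least the major.minor version $2 (e.g. "3.9").
python_at_least() {
    local have_major="${1%%.*}"
    local have_rest="${1#*.}"
    local have_minor="${have_rest%%.*}"
    local want_major="${2%%.*}"
    local want_minor="${2#*.}"
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "$have_minor" -ge "$want_minor" ]
    fi
}

# Hypothetical helper: reports each prerequisite without aborting on the first failure.
check_prerequisites() {
    local version
    if command -v python3 >/dev/null 2>&1; then
        version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
        if python_at_least "$version" "3.9"; then
            echo "python ${version}: ok"
        else
            echo "python ${version}: too old (3.9 or later is needed)"
        fi
    else
        echo "python3: not found"
    fi
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi: found"
    else
        echo "nvidia-smi: not found (is the CUDA driver installed?)"
    fi
}
```

After sourcing the snippet, running `check_prerequisites` prints one line per prerequisite. The version comparison is done on the numeric major and minor parts, so that 3.10 is correctly treated as newer than 3.9.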

git clone

Clone the repository and install the dependencies with pip.

git clone https://github.com/fishaudio/Bert-VITS2
cd Bert-VITS2/
git checkout 2.1
pip3 install torch torchvision torchaudio
pip install -r requirements.txt

Preparation

Create a shell script that sets everything up for the Tsukuyomi-chan JVS corpus.
Create a file named preprocess.sh, paste the following into it, and save it.

> preprocess.sh # create an empty file

preprocess.sh

model_name=tsukuyomi-chan
base_url="https://huggingface.co/Garydesu/bert-vits2_base_model-2.1/resolve/main"

# download pretrained models straight into the model directory,
# skipping any file that is already there (so the script can be re-run)
mkdir -p "Data/${model_name}/models/"
for f in DUR_0.pth D_0.pth G_0.pth
do
    if [ ! -e "Data/${model_name}/models/${f}" ]; then
        wget -O "Data/${model_name}/models/${f}" "${base_url}/${f}?download=true"
    fi
done

# download & unzip tsukuyomi-chan corpus
if [ ! -e "つくよみちゃんコーパス Vol.1 声優統計コーパス（JVSコーパス準拠）" ];
then
    # download the archive only if it is not already here, then extract it
    if [ ! -e "sozai-tyc-corpus1.zip" ];
    then
        wget https://tyc.rei-yumesaki.net/files/sozai-tyc-corpus1.zip
    fi
    unzip sozai-tyc-corpus1.zip
fi

# prepare wav file and text file
[ ! -e "Data/${model_name}/audios/raw" ] && mkdir -p Data/${model_name}/audios/raw
[ ! -e "Data/${model_name}/filelists" ] && mkdir -p Data/${model_name}/filelists
cp "つくよみちゃんコーパス Vol.1 声優統計コーパス（JVSコーパス準拠）/おまけ：WAV（+12dB増幅＆高音域削減）/WAV（+12dB増幅＆高音域削減）"/* "Data/${model_name}/audios/raw/"

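# remove a stale list first, because the loop below appends to it
[ -e "Data/${model_name}/filelists/text.list" ] && rm "Data/${model_name}/filelists/text.list"
# each transcript line is parsed as "<file name>:<text>", split at the first colon
while IFS= read -r line
do
    file_name=${line%%:*}
    text=${line#*:}
    data_dir="Data/${model_name}/audios/wavs"
    file_path=${data_dir}/${file_name}.wav
    lang="JP"
    echo "${file_path}|${model_name}|${lang}|${text}" >> "Data/${model_name}/filelists/text.list"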
done < "つくよみちゃんコーパス Vol.1 声優統計コーパス（JVSコーパス準拠）/04 台本と補足資料/★台本テキスト/01 補足なし台本（JSUTコーパス・JVSコーパス版）.txt"

# prepare config file
# On a fresh clone there is no config.yml yet; this first call only makes
# Bert-VITS2 copy default_config.yml to config.yml and exit.
python preprocess_text.py
base_config=$(cat config.yml)
new_config=$(echo "$base_config" | \
sed -e "s|dataset_path: \"[^\"]*\"|dataset_path: \"Data/${model_name}\"|" | \
sed -e "s|transcription_path: \"[^\"]*\"|transcription_path: \"filelists/text.list\"|" | \
sed -e "s|config_path: \"[^\"]*\"|config_path: \"config.json\"|")

echo "$new_config" > config.yml

base_conf_json=$(cat configs/config.json)
echo "$base_conf_json" | sed -e "s|\"batch_size\": [0-9]*|\"batch_size\": 4|" > Data/${model_name}/config.json

# execute prepare python script
python resample.py
python preprocess_text.py

# prepare feature files
python bert_gen.py
python emo_gen.py
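
The part of the script that builds text.list turns each transcript line into one pipe-separated entry. As a sketch of that single step in isolation, the function below applies the same parameter expansions to one line. `to_filelist_entry` is a hypothetical name used only for illustration, and the sample line is made up rather than taken from the corpus.

```shell
#!/bin/bash
# Hypothetical helper showing the per-line transformation on its own:
# $1 is one transcript line ("<file name>:<text>"), $2 is the model name.
to_filelist_entry() {
    local line="$1"
    local model_name="$2"
    local file_name="${line%%:*}"   # everything before the first colon
    local text="${line#*:}"         # everything after the first colon
    printf '%s\n' "Data/${model_name}/audios/wavs/${file_name}.wav|${model_name}|JP|${text}"
}

# Made-up sample line, not taken from the corpus.
to_filelist_entry "SAMPLE_001:こんにちは" "tsukuyomi-chan"
```

Splitting at the first colon on both sides means that a colon inside the text itself stays part of the text instead of leaking into the file name.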

Once the file is created, give it execute permission and run it.

chmod +x preprocess.sh
./preprocess.sh
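
Because preprocess.sh keeps going even when one of its commands fails, it can be worth confirming afterwards that the files it is supposed to create are really there. The following is a rough sketch under that assumption: `check_preprocess_outputs` is a hypothetical helper, and the file list only covers paths that appear in the script above.

```shell
#!/bin/bash
# Hypothetical helper: prints every expected file that is missing under
# Data/<model name> and returns non-zero if at least one is missing.
# $1 is the model name, $2 (optional) is the repository root.
check_preprocess_outputs() {
    local model_name="$1"
    local root="${2:-.}"
    local base="${root}/Data/${model_name}"
    local missing=0
    local f
    for f in models/DUR_0.pth models/D_0.pth models/G_0.pth \
             config.json filelists/text.list
    do
        if [ ! -e "${base}/${f}" ]; then
            echo "missing: ${base}/${f}"
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the Bert-VITS2 directory, `check_preprocess_outputs tsukuyomi-chan` prints nothing and succeeds when all of these files exist.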

Training

Start training with the following command.

python train_ms.py

Inference

Around line 100 of config.yml there is a setting for the model path. Change it to models/G_1000.pth (adjust the number to match your checkpoint).

webui:
  # 推理设备 (inference device)
  device: "cuda"
  # 模型路径 (model path)
  model: "genshin/models/G_8000.pth"  # -> change to "models/G_1000.pth"
...
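
Since the step number in the checkpoint name depends on how long training has run, a small helper can look up the newest one for you. This is a sketch only: `latest_generator_checkpoint` is a hypothetical name, and it assumes that training writes its generator checkpoints as G_<step>.pth into Data/<model name>/models, which the article implies but does not state.

```shell
#!/bin/bash
# Hypothetical helper: prints "models/G_<step>.pth" for the G_<step>.pth file
# with the largest step number in directory $1.
# Returns non-zero when the directory contains no such checkpoint.
latest_generator_checkpoint() {
    local models_dir="$1"
    local best=-1
    local path name step
    for path in "${models_dir}"/G_*.pth
    do
        [ -e "$path" ] || continue      # the glob matched nothing
        name="${path##*/}"              # e.g. G_1000.pth
        step="${name#G_}"               # e.g. 1000.pth
        step="${step%.pth}"             # e.g. 1000
        case "$step" in
            ''|*[!0-9]*) continue ;;    # skip names without a plain number
        esac
        if [ "$step" -gt "$best" ]; then
            best="$step"
        fi
    done
    [ "$best" -ge 0 ] || return 1
    echo "models/G_${best}.pth"
}
```

For example, `latest_generator_checkpoint Data/tsukuyomi-chan/models` prints the value to put into the `model:` entry. The steps are compared as numbers rather than as strings, so G_8000.pth ranks above G_900.pth.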

Then launch the web UI.

python webui.py
