@takutoo068 posted at 2024-10-03

Hugging Face phonemizer does not work and raises an error

Q&A

Closed

Python Beginner Torch GoogleColaboratory huggingface

What I want to solve

I want to generate audio with a Hugging Face pipeline, but I get an error that seems to say phonemizer is not installed. Could you tell me how to resolve it?

Model used (from Hugging Face)

anhnct/audioldm2_gigaspeech

Execution environment (only the parts that seem relevant)

・OS: Linux x86_64 Ubuntu 22.04.3 LTS (on Google Colab), Windows 11
・python: 3.10
・cuda: 12.2

・espeak-phonemizer-windows 1.0.4
・phonemizer 3.3.0

・torch: 2.4.1+cu121
・torchaudio: 2.4.1+cu121
・torchsummary: 1.5.1
・torchvision: 0.19.1+cu121
・scipy: 1.13.1
・diffusers: 0.30.3
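
For reference, a version list like the one above can be collected inside the notebook itself. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are the ones listed above plus `transformers`, which appears in the traceback.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a {package: version} dict; packages that are missing map to 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not visible to the interpreter running this code.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    names = [
        "phonemizer",
        "espeak-phonemizer-windows",
        "torch",
        "torchaudio",
        "diffusers",
        "transformers",
        "scipy",
    ]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version}")
```

Running this in the same runtime as the failing cell shows what that interpreter can actually see, which may differ from what `pip` reported when it installed the packages.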

Code I ran (Google Colab)

from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("anhnct/audioldm2_gigaspeech")

prompt = "An female actor say with angry voice"
transcript = "wish you have a good day, i hope you never forget me"
negative_prompt = "low quality"


audio = pipe(prompt,transcript).audio[0]

Problem / error encountered

Collecting espeak-phonemizer-windows
  Downloading espeak_phonemizer_windows-1.0.4-py3-none-any.whl.metadata (2.5 kB)
Downloading espeak_phonemizer_windows-1.0.4-py3-none-any.whl (9.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.4/9.4 MB 2.7 MB/s eta 0:00:00
Installing collected packages: espeak-phonemizer-windows
Successfully installed espeak-phonemizer-windows-1.0.4
vae/diffusion_pytorch_model.safetensors not found
Loading pipeline components...: 100%
 11/11 [00:03<00:00,  5.08it/s]
An error occurred while trying to fetch /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/unet.
Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
An error occurred while trying to fetch /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/projection_model: Error no file named diffusion_pytorch_model.safetensors found in directory /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/projection_model.
Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
An error occurred while trying to fetch /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/vae: Error no file named diffusion_pytorch_model.safetensors found in directory /root/.cache/huggingface/hub/models--anhnct--audioldm2_gigaspeech/snapshots/c812a7861f38a69441a8e0428438e782d9864614/vae.
Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-22-8b84e53bddb4> in <cell line: 15>()
     13 
     14 
---> 15 audio = pipe(prompt,transcript).audio[0]

9 frames
/usr/local/lib/python3.10/dist-packages/transformers/models/vits/tokenization_vits.py in prepare_for_tokenization(self, text, is_split_into_words, normalize, **kwargs)
    188         if self.phonemize:
    189             if not is_phonemizer_available():
--> 190                 raise ImportError("Please install the `phonemizer` Python package to use this tokenizer.")
    191 
    192             filtered_text = phonemizer.phonemize(

ImportError: Please install the `phonemizer` Python package to use this tokenizer.

What I tried

!pip install datasets transformers
!pip install phonemizer
!apt-get install espeak

Reference page

!pip install espeak-phonemizer-windows

Reference page
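
To narrow down why the tokenizer still reports `phonemizer` as missing after the installs above, a small diagnostic cell can help. This is a minimal sketch using only the standard library; the function name `phonemizer_diagnostics` is hypothetical. It also rests on an assumption that is not confirmed here: that `transformers` evaluates its `is_phonemizer_available()` check when the library is first imported, in which case installing `phonemizer` afterwards would only take effect once the runtime is restarted.

```python
import importlib.util
import shutil
import sys


def phonemizer_diagnostics():
    """Collect facts that may explain the ImportError raised by the VITS tokenizer."""
    return {
        # Interpreter running this notebook; `!pip` could in principle target another one.
        "python": sys.executable,
        # True if this interpreter can locate the `phonemizer` package right now.
        "phonemizer_importable": importlib.util.find_spec("phonemizer") is not None,
        # True if transformers was already imported in this session. Assumption: its
        # availability check may have run before phonemizer was installed.
        "transformers_already_imported": "transformers" in sys.modules,
        # Path of the eSpeak executables on PATH, or None if they are not found.
        "espeak": shutil.which("espeak"),
        "espeak_ng": shutil.which("espeak-ng"),
    }


if __name__ == "__main__":
    for key, value in phonemizer_diagnostics().items():
        print(f"{key}: {value}")
```

If `phonemizer_importable` is True while the pipeline still raises the ImportError, restarting the Colab runtime and rerunning the cells from the top is one thing worth trying under the assumption above.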

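Finally

I am a beginner and still unsure how things work, so I may not have provided enough information.
If the situation is unclear from this alone, or if more information is needed to solve the problem, I would appreciate it if you could point that out in the comments. Thank you in advance.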
