
The story of trying to get Stable Diffusion running on an M1 Max

Last updated at 2022-09-19, posted at 2022-09-18

I got the generative model Stable Diffusion running on a Mac with Apple Silicon (M1/M2 and so on), so here is how it went.

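Stable Diffusion is supposed to need Python 3.8, and there are M1-specific problems on top of that.
This article is a tearful documentary of stubbornly getting it to run on Python 3.9 without even using Anaconda.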

About the Mac used

The Mac used this time is:
14-inch MacBook Pro 2021
M1 Max, 10-core CPU, 24-core GPU
32 GB of memory

First, clone the stable-diffusion repository.

git clone https://github.com/CompVis/stable-diffusion

Move into the stable-diffusion folder.

cd stable-diffusion

Go to
https://huggingface.co/CompVis/stable-diffusion-v-1-4-original
register an account, and download the model from the "Download the weights" section.

Install Rust so that it is available.

brew install rust

"Rust is a multi-paradigm programming language designed for performance, memory safety, and safe concurrency."
Or so they say (　･`ω･´)  Yeah, no idea what that means.

pip install torch torchvision torchaudio
pip install albumentations opencv-python pudb invisible-watermark imageio imageio-ffmpeg pytorch-lightning omegaconf test-tube streamlit einops torch-fidelity transformers torchmetrics kornia
pip install -e "git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers"
pip install -e "git+https://github.com/openai/CLIP.git@main#egg=clip"
pip install -e .
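
Before going further, it can help to confirm what the installs actually produced. The following is only a sketch of such a check, not a step from the original procedure: describe_environment is a name made up here, and it assumes that the things worth knowing are the Python version, the CPU architecture, and whether the installed PyTorch reports the MPS backend.

```python
import importlib.util
import platform
import sys


def describe_environment():
    """Collect the facts that matter for this setup into a plain dict."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "machine": platform.machine(),  # "arm64" for a native Apple Silicon Python
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "mps_available": False,
    }
    if info["torch_installed"]:
        import torch

        # torch.backends.mps is only present in PyTorch builds with MPS support.
        mps = getattr(torch.backends, "mps", None)
        info["mps_available"] = bool(mps is not None and mps.is_available())
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If machine is not arm64, the Python in use is probably running under Rosetta, and mps_available will most likely stay False.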

Trial and error to get rid of the errors

It does not run at all. I keep trying things, but the dlib-related errors just continue.
pip install finishes without complaint, yet the import fails.
As shown below, I get an ImportError. Argh.

>>> import dlib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.9/site-packages/dlib/__init__.py", line 19, in <module>
    from _dlib_pybind11 import *
ImportError: dlopen(/opt/homebrew/lib/python3.9/site-packages/_dlib_pybind11.cpython-39-darwin.so, 0x0002): symbol not found in flat namespace (_png_do_expand_palette_rgb8_neon)

I was completely stuck on dlib...
The package that pip installs apparently does not match this environment,
so I install dlib with Homebrew instead.
After that, clear pip's cache with pip cache purge.

brew install dlib
pip cache purge

Once the cache is cleared, pip install dlib afresh.

pip install dlib
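
To see whether a reinstall like this actually fixed the import, a small check script saves retyping import lines in the REPL. This is a sketch under my own assumptions: check_imports is a hypothetical helper, and the module list in the usage part is simply taken from the packages mentioned in this article.

```python
import importlib


def check_imports(module_names):
    """Return {name: None} on success or {name: "ErrorType: message"} on failure."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # a failed dlopen also surfaces as ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names taken from the article; adjust the list as needed.
    for name, error in check_imports(["dlib", "torch", "transformers"]).items():
        print(f"{name}: {'ok' if error is None else error}")
```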

I'm getting tired...

No module named 'ldm.util'; 'ldm' is not a package

The errors still won't go away.

How about this!!!

stable-diffusion % python scripts/txt2img.py --prompt "a red juicy apple floating in outer space, like a planet" --n_samples 1 --n_iter 1 --plms
Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 342/342 [00:00<00:00, 122kB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.55k/4.55k [00:00<00:00, 1.27MB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.22G/1.22G [00:40<00:00, 30.4MB/s]
Global seed set to 42
Traceback (most recent call last):
  File "/Users/satoshi/git/stable-diffusion/scripts/txt2img.py", line 344, in <module>
    main()
  File "/Users/satoshi/git/stable-diffusion/scripts/txt2img.py", line 240, in main
    model = load_model_from_config(config, f"{opt.ckpt}")
  File "/Users/satoshi/git/stable-diffusion/scripts/txt2img.py", line 50, in load_model_from_config
    pl_sd = torch.load(ckpt, map_location="cpu")
  File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 699, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/opt/homebrew/lib/python3.9/site-packages/torch/serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/ldm/stable-diffusion-v1/model.ckpt'

A little progress.
It says the model file is missing,
so create the folder it is looking in.

mkdir -p models/ldm/stable-diffusion-v1/

Other articles symlink the model downloaded earlier into this folder,
but that is a hassle, so I simply save it in this folder as model.ckpt.

stable-diffusion % ls ./models/ldm/stable-diffusion-v1/           
model.ckpt

This time I renamed "sd-v1-4-full-ema.ckpt" to "model.ckpt".
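
For reference, both options (symlinking as the other articles do, or copying as done here) can be written as one small helper. This is a sketch, not what was actually run: place_checkpoint is a hypothetical function, and only the target path models/ldm/stable-diffusion-v1/model.ckpt comes from the error message above.

```python
import shutil
from pathlib import Path


def place_checkpoint(downloaded, repo_root=".", link=True):
    """Expose `downloaded` as models/ldm/stable-diffusion-v1/model.ckpt."""
    source = Path(downloaded).expanduser().resolve()
    if not source.is_file():
        raise FileNotFoundError(source)
    target = Path(repo_root) / "models" / "ldm" / "stable-diffusion-v1" / "model.ckpt"
    target.parent.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    if target.exists() or target.is_symlink():
        target.unlink()  # replace a stale file or a dangling link
    if link:
        target.symlink_to(source)  # keeps a single copy of the multi-GB file
    else:
        shutil.copy2(source, target)
    return target
```

Called from the repository root, something like place_checkpoint("~/Downloads/sd-v1-4-full-ema.ckpt") would do the job; the Downloads path is only an assumed example location.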

Now for the rematch!!

python scripts/txt2img.py --prompt "a red juicy apple floating in outer space, like a planet" --n_samples 1 --n_iter 1 --plms

It looks like it is about to get somewhere, but...

AttributeError: module 'ldm.modules.encoders.modules' has no attribute 'FrozenCLIPEmbedder'

It complains that there is no class called FrozenCLIPEmbedder.
Well then, let's just add the class.

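# Goes in ldm/modules/encoders/modules.py; freeze/forward/encode follow the upstream CompVis code
from transformers import CLIPTokenizer, CLIPTextModel
class FrozenCLIPEmbedder(AbstractEncoder):
    """Uses the CLIP transformer encoder for text (from Hugging Face)"""
    def __init__(self, version="openai/clip-vit-large-patch14", device="cuda", max_length=77):
        super().__init__()
        self.tokenizer = CLIPTokenizer.from_pretrained(version)
        self.transformer = CLIPTextModel.from_pretrained(version)
        self.device = device  # token ids are moved to this device in forward()
        self.max_length = max_length
        self.freeze()
    def freeze(self):  # inference only: eval mode, no gradients
        self.transformer = self.transformer.eval()
        for param in self.parameters():
            param.requires_grad = False
    def forward(self, text):  # prompt strings -> CLIP last hidden states
        enc = self.tokenizer(text, truncation=True, max_length=self.max_length, padding="max_length", return_tensors="pt")
        return self.transformer(input_ids=enc["input_ids"].to(self.device)).last_hidden_state
    def encode(self, text):
        return self(text)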

Now, come forth!!!

python scripts/txt2img.py --prompt "a red juicy apple floating in outer space, like a planet" --n_samples 1 --n_iter 1 --plms

I had forgotten the M1-specific changes. A Mac cannot use CUDA,
so rewrite the following three places in txt2img.py.

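    # (1) in load_model_from_config(): replaces model.cuda()
    model.to("mps")
    model.eval()
    return model

    # (2) in main(): pick MPS when available instead of CUDA
    device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")
    model = model.to(device)

        # (3) in main(): replaces `with precision_scope("cuda"):`
        with nullcontext("mps"):
            with model.ema_scope():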

Next, ldm/models/diffusion/plms.py:

    def register_buffer(self, name, attr):
        if type(attr) == torch.Tensor:
            #if attr.device != torch.device("cuda"):
            #    attr = attr.to(torch.device("cuda"))
            # MPS does not support float64, so cast to float32 before moving
            if attr.device != torch.device("mps"):
                attr = attr.to(torch.float32).to(torch.device("mps")).contiguous()
        setattr(self, name, attr)  # unchanged from upstream
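
Hard-coding "mps" in several files works on this Mac, but the same choice can be made in one place. The following is a sketch of that idea rather than what the article actually does: pick_device_name and detect_device_name are hypothetical helpers, and the preference order (MPS, then CUDA, then CPU) is my own assumption.

```python
def pick_device_name(mps_available, cuda_available):
    """Prefer Apple's MPS backend, then CUDA, then fall back to the CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device_name():
    """Ask the installed PyTorch which backends are usable."""
    import torch  # imported lazily so the helper above works without PyTorch

    mps = getattr(torch.backends, "mps", None)  # absent in builds without MPS support
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device_name(mps_available, torch.cuda.is_available())
```

With this, torch.device(detect_device_name()) could stand in for the literal torch.device("mps") in the edits above.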

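On M1 it apparently helps to prefix the command with PYTORCH_ENABLE_MPS_FALLBACK=1, which lets PyTorch fall back to the CPU for operations that are not implemented on MPS.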

PYTORCH_ENABLE_MPS_FALLBACK=1 python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
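
If typing the prefix every time gets tedious, the same environment variable can be set from a small launcher. This is only a sketch: mps_fallback_env and run_txt2img are hypothetical helpers, and it assumes the launcher is run from the repository root so that scripts/txt2img.py resolves.

```python
import os
import subprocess
import sys


def mps_fallback_env(base=None):
    """Return a copy of the environment with the MPS CPU fallback switched on."""
    env = dict(os.environ if base is None else base)
    env["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return env


def run_txt2img(prompt, extra_args=("--plms",)):
    """Run scripts/txt2img.py from the repository root with the fallback enabled."""
    command = [sys.executable, "scripts/txt2img.py", "--prompt", prompt, *extra_args]
    return subprocess.run(command, env=mps_fallback_env(), check=True)
```

Passing the variable through the child process environment means it is already in place when that process imports torch.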

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

So, is this the last hurdle...?

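Rewrite the end of the layer_norm function in torch/nn/functional.py (the copy installed under site-packages) as follows, so that the input is made contiguous first.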

    if has_torch_function_variadic(input, weight, bias):
        return handle_torch_function(
            layer_norm, (input.contiguous(), weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
        )
    #return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
    return torch.layer_norm(input.contiguous(),normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
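
Editing a file inside site-packages gets overwritten whenever PyTorch is reinstalled. A possible alternative, which is not what this article did, is to wrap the function at runtime. The sketch below assumes that calling patch_layer_norm() near the top of scripts/txt2img.py, before the model is used, is early enough; with_contiguous_input and patch_layer_norm are names made up here.

```python
import functools


def with_contiguous_input(fn):
    """Wrap `fn` so its first argument is made contiguous before the call."""

    @functools.wraps(fn)
    def wrapper(input, *args, **kwargs):
        return fn(input.contiguous(), *args, **kwargs)

    wrapper._contiguous_patched = True  # marker so the patch is applied only once
    return wrapper


def patch_layer_norm():
    """Swap torch.nn.functional.layer_norm for the wrapped version, once."""
    import torch.nn.functional as F

    if not getattr(F.layer_norm, "_contiguous_patched", False):
        F.layer_norm = with_contiguous_input(F.layer_norm)
```

This only affects code that looks the function up as torch.nn.functional.layer_norm at call time; anything that imported the function by name beforehand would keep the original.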

Last challenge

PYTORCH_ENABLE_MPS_FALLBACK=1 python scripts/txt2img.py --prompt "a red juicy apple and cat" --plms

It finally works!
What is left is tuning the parameters, I suppose.
The sample images come out as a six-panel grid.

Changing the prompt to "cats" gives...

...a pretty amusing image.

Reference: "Stable-Diffusion を適当に変換試行する"
Other reference articles:
https://zenn.dev/ktakayama/articles/6c627e0956f32c
https://github.com/davisking/dlib/issues/2268#issuecomment-800792633
