
Using StableDiffusion in 2 lines with KerasCV

Posted at 2022-10-02

This is a fairly trivial topic, but it is handy and nobody had posted about it yet, so I am writing it up.

As shared in this tweet, StableDiffusion has become very easy to use. It appears to work by fetching a Keras version of the StableDiffusion model that has been uploaded to huggingface.

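Running the code below is all it takes to use StableDiffusion. Setting aside the import statements and the part that displays the image, it is 2 lines, so please forgive the title...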

import cv2
from keras_cv.models import StableDiffusion
from matplotlib import pyplot as plt
from PIL import Image

model = StableDiffusion(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=100,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")

# Reload the file saved above. OpenCV reads images as BGR, so convert to RGB for matplotlib.
img2 = cv2.imread("horse.png")
src = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
plt.imshow(src)
plt.axis("off")
plt.show()
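
The code above saves only `img[0]`. With `batch_size` greater than 1, each image in the returned batch would need its own file. Below is a minimal sketch of that, assuming the result is a uint8 array shaped `(batch, height, width, 3)` (which is consistent with passing `img[0]` to `Image.fromarray`). `save_batch` and its parameters are hypothetical names for illustration, not part of KerasCV.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def save_batch(images: np.ndarray, out_dir: str, prefix: str = "image") -> list[Path]:
    """Save each image of a (batch, height, width, 3) uint8 array as a PNG.

    Hypothetical helper: the array layout is an assumption about what
    text_to_image returns, based on how img[0] is used above.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, array in enumerate(images):
        path = out / f"{prefix}_{i}.png"
        Image.fromarray(array).save(path)
        paths.append(path)
    return paths
```

Under those assumptions, something like `save_batch(img, "out", prefix="horse")` would write `out/horse_0.png`, `out/horse_1.png`, and so on.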

Here are some points to note when running it.

- The maximum prompt length is 77 tokens. (refs. https://tech.isid.co.jp/entry/2022/09/27/Stable_Diffusion入門-長い呪文は切り捨てられる)
- Note that num_steps 100 takes about 1 hour on CPU (checked on colaboratory). On GPU it takes about 2 minutes.
- At the time of writing (2022/10), colaboratory ships tensorflow 2.8, which is too old for this feature (it requires 2.9 or later), so run !pip install tensorflow keras-cv -U before trying it.
- Running on CPU produces no errors, but after switching to GPU you may get Original error: UNIMPLEMENTED: DNN library is not found. This is a cudnn-related problem; running !apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2 and restarting the runtime should get it working.
  - refs. https://stackoverflow.com/questions/71000120/colab-0-unimplemented-dnn-library-is-not-found
- When displaying the image, opencv handles images in BGR order, so they need to be converted to RGB.
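
To make the last note concrete: for a 3-channel image, the BGR to RGB conversion amounts to reversing the channel axis. The sketch below shows this with NumPy alone; `bgr_to_rgb` is a hypothetical helper name, and it is meant as an illustration of the idea rather than a replacement for `cv2.cvtColor`.

```python
import numpy as np


def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis of an (H, W, 3) array: BGR -> RGB."""
    return image[..., ::-1].copy()


# A single pure-blue pixel laid out the way OpenCV loads it: (B, G, R).
bgr_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb_pixel = bgr_to_rgb(bgr_pixel)

# In RGB order, the blue value now sits in the last slot.
print(rgb_pixel[0, 0])
```

The conversion is only needed because the image is read back from disk with opencv. The array returned by `text_to_image` is passed straight to `Image.fromarray` in the code above, which suggests it is already in RGB order, so `plt.imshow(img[0])` would presumably display it without the round trip.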

To close out this article, here are the results of trying a few prompts I liked.

the living room of a cozy wooden house with a fireplace, at night, interior design, d & d concept art, d & d wallpaper, warm, digital art. art by james gurney and larry elmore.

city made out of glass : : close shot : : 3 5 mm, realism, octane render, 8 k, exploration, cinematic, trending on artstation, realistic, 3 5 mm camera, unreal engine, hyper detailed, photo – realistic maximum detail.

An expressive oil painting of a tennis player smashing, depicted as an explosion of a nebula

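All of them came out looking pretty good. Using this to collect prompts automatically, generate images endlessly, and stream them to the digital photo frame at home seems like it would make life a lot richer. Then again, if I had a GPU like that, I feel I should be using it for something else...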
