More than 1 year has passed since last update.

写真からその人のいろんな画像を生成できるAI

Last updated at 2024-02-16Posted at 2024-02-16

その人の顔でなんでも指定できる

とりあえずこの人で試してみる。

誰か。フリー素材のジェフさんである。
写真は一枚でもいい。
この人の顔で、こういう写真を欲しいという要望を文章で入力する。

海で着物を着て蛇をジャグリングをしている男

楽しげな画像が生成される。

Photo Makerというモデルでできる。
オープンソースです。

使い方

インストール

pip install diffusers
pip install git+https://github.com/TencentARC/PhotoMaker.git
pip install accelerate
git clone https://github.com/TencentARC/PhotoMaker.git
cd PhotoMaker/

モデルパイプライン初期化

import torch
import numpy as np
import random
import os
from PIL import Image

from diffusers.utils import load_image
from diffusers import EulerDiscreteScheduler, DDIMScheduler
from huggingface_hub import hf_hub_download

from photomaker import PhotoMakerStableDiffusionXLPipeline

base_model_path = 'SG161222/RealVisXL_V3.0'
device = "cuda"
photomaker_ckpt = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")

pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16",
).to(device)

pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_ckpt),
    subfolder="",
    weight_name=os.path.basename(photomaker_ckpt),
    trigger_word="img"
)
pipe.id_encoder.to(device)


#pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
#pipe.fuse_lora()

pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
# pipe.set_adapters(["photomaker"], adapter_weights=[1.0])
pipe.fuse_lora()

実行

promptに生成したい画像のテキストを入れて実行する。

input_folder_name = 'jeff' # input images directory
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))

## Note that the trigger word `img` must follow the class word for personalization
prompt = "close up portrait, a man img wearing a kimono and juggling snakes, in the sea, face, high quality, film grain"
negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch)"
generator = torch.Generator(device=device).manual_seed(42)

## Parameter setting
num_steps = 50
style_strength_ratio = 30
start_merge_step = int(float(style_strength_ratio) / 100 * num_steps)
if start_merge_step > 30:
    start_merge_step = 30

images = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=4,
    num_inference_steps=num_steps,
    start_merge_step=start_merge_step,
    generator=generator,
).images

save_path = "./outputs"
os.makedirs(save_path, exist_ok=True)
for idx, image in enumerate(images):
    image.save(os.path.join(save_path, f"photomaker_{idx:02d}.png"))

PIL Imageとして出力される。

完全に猫人間になった人

コラボ：
https://colab.research.google.com/drive/1KPj3jphUTxpVaqLu0yfZKsoprMyl0x6M?usp=sharing

🐣

フリーランスエンジニアです。
AIについて色々記事を書いていますのでよかったらプロフィールを見てみてください。

もし以下のようなご要望をお持ちでしたらお気軽にご相談ください。
AIサービスを開発したい、ビジネスにAIを組み込んで効率化したい、AIを使ったスマホアプリを開発したい、
ARを使ったアプリケーションを作りたい、スマホアプリを作りたいけどどこに相談したらいいかわからない…

いずれも中間コストを省いたリーズナブルな価格でお請けできます。

お仕事のご相談はこちらまで
rockyshikoku@gmail.com

機械学習やAR技術を使ったアプリケーションを作っています。
機械学習／AR関連の情報を発信しています。

X
Medium
GitHub

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up