
I Turned One of Our Executive Officers into a Yu-Gi-Oh! Card with ChatGPT, Without Asking

Posted at 2023-04-27

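Everyone makes a crappy photoshopped custom Yu-Gi-Oh! card at least once in their life.

I'm not an active duelist, but I like Yu-Gi-Oh! enough to watch match commentary videos now and then. People of the original Yu-Gi-Oh! generation who could handle a Paint-style program on a computer must have, at least once, pasted someone else's face photo into a crude collage and turned it into an original card (I'm sure of it).
So this is an attempt to have AI process a photo of one of our company's executive officers, without permission, and make an original Yu-Gi-Oh! card out of it.
This time I'll offer up Mizutani, our CHRO, who hosts a live radio show on YouTube every Thursday, as the sacrifice for the †Shadow Game†.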

Yu-Gi-Oh! API

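There are some unusual people in the world, and one of them runs a service that aggregates Yu-Gi-Oh! card data and exposes it as a REST API.

For example, everyone's favorite シーホース (Seahorse; English card name: Tatsunootoshigo) comes back with information like this.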

$ curl 'https://db.ygoprodeck.com/api/v7/cardinfo.php?fname=Tatsunootoshigo' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   696    0   696    0     0   1784      0 --:--:-- --:--:-- --:--:--  1789
{
  "data": [
    {
      "id": 47922711,
      "name": "Tatsunootoshigo",
      "type": "Normal Monster",
      "frameType": "normal",
      "desc": "A bizarre monster with the torso of a horse and a fish's tail. It gallops through the sea as if it were the wind.",
      "atk": 1350,
      "def": 1600,
      "level": 5,
      "race": "Beast",
      "attribute": "EARTH",
      "card_images": [
        {
          "id": 47922711,
          "image_url": "https://images.ygoprodeck.com/images/cards/47922711.jpg",
          "image_url_small": "https://images.ygoprodeck.com/images/cards_small/47922711.jpg",
          "image_url_cropped": "https://images.ygoprodeck.com/images/cards_cropped/47922711.jpg"
        }
      ],
      "card_prices": [
        {
          "cardmarket_price": "0.00",
          "tcgplayer_price": "0.00",
          "ebay_price": "30.00",
          "amazon_price": "4.85",
          "coolstuffinc_price": "0.00"
        }
      ]
    }
  ]
}

This time, I'll extract the information for cards released on or after April 1, 2020 as the data to learn from, and try generating a card from it.
https://db.ygoprodeck.com/api/v7/cardinfo.php?startdate=04/01/2020&enddate=12/31/2023
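As a minimal sketch, the date-window URL above could be assembled in Python as follows. The helper name `build_cardinfo_url` is hypothetical; only the `startdate` and `enddate` parameters and the MM/DD/YYYY date form come from the URL shown above.

```python
from urllib.parse import urlencode

API_BASE = "https://db.ygoprodeck.com/api/v7/cardinfo.php"


def build_cardinfo_url(startdate: str, enddate: str) -> str:
    """Build a cardinfo.php URL limited to a release-date window.

    Dates use the MM/DD/YYYY form seen in the URL above.
    """
    # safe="/" leaves the slashes unescaped so the result matches the URL above.
    query = urlencode({"startdate": startdate, "enddate": enddate}, safe="/")
    return f"{API_BASE}?{query}"


print(build_cardinfo_url("04/01/2020", "12/31/2023"))
```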

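Extracting the Yu-Gi-Oh! card art style

First, I want to extract the art style of Yu-Gi-Oh! cards. Many people train a LoRA for this kind of thing, but since this is a one-off joke I won't go to that much trouble. The steps this time are:

1. Collect Yu-Gi-Oh! card images.
2. Filter down to monster images only.
3. Generate captions from the images with CLIP.
4. Have ChatGPT extract the art style from the captions.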

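For step 1, if you look closely at the Yu-Gi-Oh! API response above, there is an attribute called image_url_cropped. It points to an image cropped to just the illustration part of the card. The type field of the same response holds the card type, such as "Normal Monster", which makes the filtering in step 2 possible.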
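Steps 1 and 2 could be sketched roughly like this, working on an already decoded API response. The function name `cropped_monster_images` is hypothetical, and the filter assumes that every monster card has the word "Monster" in its `type` while other cards do not. The second sample entry is made up purely for illustration.

```python
def cropped_monster_images(payload: dict) -> list[tuple[str, str]]:
    """Return (card name, cropped artwork URL) pairs for monster cards only."""
    results = []
    for card in payload.get("data", []):
        # Assumption: non-monster cards carry a type without the word "Monster".
        if "Monster" not in card.get("type", ""):
            continue
        for image in card.get("card_images", []):
            url = image.get("image_url_cropped")
            if url:
                results.append((card["name"], url))
    return results


sample = {
    "data": [
        {
            "name": "Tatsunootoshigo",
            "type": "Normal Monster",
            "card_images": [
                {"image_url_cropped": "https://images.ygoprodeck.com/images/cards_cropped/47922711.jpg"}
            ],
        },
        # Hypothetical non-monster entry, only here to show the filter.
        {
            "name": "Some Spell",
            "type": "Spell Card",
            "card_images": [{"image_url_cropped": "https://example.invalid/spell.jpg"}],
        },
    ]
}

for name, url in cropped_monster_images(sample):
    print(name, url)
```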
For this project I'm using AUTOMATIC1111/stable-diffusion-webui, which lets you run Stable Diffusion locally.

It has a button called "Interrogate CLIP", which generates a caption from an image.

However, processing images one at a time is not very efficient. Fortunately, stable-diffusion-webui has a handy feature called API mode.
Passing the --api parameter to webui.bat, the script that launches stable-diffusion-webui, enables API mode.

$ webui.bat --api

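The API layer of stable-diffusion-webui is built with FastAPI, so opening localhost:7860/docs shows the familiar OpenAPI UI.

It turns out there is an endpoint there, POST /sdapi/v1/interrogate, that generates a caption from an image with CLIP. It isn't listed in the official wiki, so I don't know how stable it is, but it exists as a sort of hidden feature. Besides interrogate, APIs for many other features are exposed as well.

Let's generate a caption for the seahorse again.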

$ curl -X 'POST' \
  'http://localhost:7860/sdapi/v1/interrogate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "image": "<base64-encoded image data>",
  "model": "clip"
}'
{
  "caption": "a cartoon of a black horse with a blue tail and a purple tail and a purple nose and mouth, Benjamin Marra, rob rey and kentaro miura style, concept art, pop surrealism"
}
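For batch processing, the same request could be issued from Python instead of curl. This is only a sketch under assumptions: the endpoint, the `image` and `model` fields, and the `caption` response key are taken from the curl example above, while the function names and the file-reading loop are hypothetical. It expects a webui instance running locally with --api.

```python
import base64
import json
from pathlib import Path
from urllib import request

INTERROGATE_URL = "http://localhost:7860/sdapi/v1/interrogate"


def build_interrogate_payload(image_bytes: bytes, model: str = "clip") -> dict:
    """Wrap raw image bytes in the JSON body used by the curl example."""
    return {"image": base64.b64encode(image_bytes).decode("ascii"), "model": model}


def interrogate(image_bytes: bytes, url: str = INTERROGATE_URL) -> str:
    """POST one image to the locally running webui and return its caption."""
    body = json.dumps(build_interrogate_payload(image_bytes)).encode("utf-8")
    req = request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["caption"]


def caption_files(paths: list[Path]) -> dict[str, str]:
    """Caption every image file in turn, keyed by file name."""
    return {path.name: interrogate(path.read_bytes()) for path in paths}
```

Calling `caption_files(sorted(Path("images").glob("*.jpg")))` would then caption a whole folder, where `images` is a placeholder directory name.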

DeepL translated the caption into Japanese as follows.

青い尾と紫の尾と紫の鼻と口を持つ黒い馬の漫画、Benjamin Marra、Rob Rey、Kentaro Miuraスタイル、コンセプトアート、ポップシュールレアリスム。

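Some parts look wrong, but it seems to work.
In this way, a large set of images yields a large set of captions. I feed this data to ChatGPT to extract the art style. The prompt, written in Japanese, was as follows.

Each sentence beginning with "caption:" is AI-generated caption data for a picture drawn by a certain artist. The data may therefore contain errors. From the captions, understand this artist's style, think of the keywords that seem important for drawing pictures similar to this artist's, select the 20 that are most important and most accurate, and output them. When outputting, exclude words that describe the central motif of a picture and prioritize reproducing the art style.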


caption: a cartoon character is standing in front of a giant iceberg with a glowing light coming from it's mouth, Aya Goda, official art, concept art, crystal cubism
caption: a man sitting at a desk writing on a piece of paper with a green light coming from his hand, Edmond Xavier Kapp, official art, a character portrait, academic art
caption: a woman in a dress and hat with a cat tail and tail tail, standing in a room with a checkered floor, Eizan Kikukawa, official art, a character portrait, remodernism
caption: a cartoon of a woman with a dragon like body and wings on her head and arms, standing in front of a dark background, Anne Stokes, official art, concept art, sots art
caption: a cartoon elephant with a red eye and a white tusk on its head and a black tail and a red eye on its head, Amédée Ozenfant, official art, a character portrait, sumatraism
caption: a picture of a white dragon with a purple background and a purple background with a purple background and a purple background with a purple background and white dragon, Du Jin, sharp claws, a character portrait, furry art
...
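Assembling a prompt like this from a list of captions is mechanical. A minimal sketch, with a hypothetical helper name and the instruction text passed in as a parameter:

```python
def build_style_prompt(instruction: str, captions: list[str]) -> str:
    """Join the instruction and one "caption: ..." line per non-empty caption."""
    lines = [f"caption: {caption.strip()}" for caption in captions if caption.strip()]
    return instruction.strip() + "\n\n" + "\n".join(lines)


example = build_style_prompt(
    "Pick 20 style keywords from the captions.",  # placeholder instruction
    [
        "a cartoon of a black horse with a blue tail, concept art",
        "",  # empty captions are skipped
        "a picture of a white dragon, a character portrait",
    ],
)
print(example)
```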

ChatGPT returned the following output.

Keywords:

cartoon character
glowing light
iceberg
Aya Goda
crystal cubism
green light
man
desk
Edmond Xavier Kapp
character portrait
academic art
woman
dress
hat
cat tail
tail tail
Eizan Kikukawa
checkered floor
remodernism
dragon body
wings
dark background
Anne Stokes
sots art

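Some of these make little sense, and woman and man are both in there, but I'll accept this as the extracted style for now. Personally, I felt that glowing light and dark background do look fairly Yu-Gi-Oh!-like.

Image generation

Here is the source image.

A thorough look at Psychic VR Lab, from company overview to services and business! Exploring its appeal!

I run this through Stable Diffusion's img2img to generate images.
The model was Stable Diffusion 2.0, with 4 Loopback passes and a Denoising strength of 0.8. Basically, it's just pulling the gacha over and over.
The prompt looks like this. I took the style keywords obtained earlier, removed elements that seemed unnecessary such as woman, and hammered home that this is a Yu-Gi-Oh! card illustration!!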

Ultra detailed illustration of a Yu-Gi-Oh! card drawn by professional illustrator Kazuki Takahashi.cartoon character,glowing light,iceberg,Aya Goda,crystal cubism,green light,man,desk,Edmond Xavier Kapp,character portrait,academic art,hat,Eizan Kikukawa,remodernism,wings,dark background,Anne Stokes,sots art
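The prompt above can be reproduced from the keyword list by dropping a handful of entries. The sketch below is one way to do it. The exclusion list is inferred by comparing the ChatGPT output with the final prompt (the text itself only names woman), and the helper name is hypothetical.

```python
PREFIX = (
    "Ultra detailed illustration of a Yu-Gi-Oh! card drawn by "
    "professional illustrator Kazuki Takahashi."
)

# Keywords exactly as returned by ChatGPT, in order.
KEYWORDS = [
    "cartoon character", "glowing light", "iceberg", "Aya Goda",
    "crystal cubism", "green light", "man", "desk", "Edmond Xavier Kapp",
    "character portrait", "academic art", "woman", "dress", "hat",
    "cat tail", "tail tail", "Eizan Kikukawa", "checkered floor",
    "remodernism", "dragon body", "wings", "dark background",
    "Anne Stokes", "sots art",
]

# Inferred from the final prompt: these keywords do not appear in it.
EXCLUDE = ["woman", "dress", "cat tail", "tail tail", "checkered floor", "dragon body"]


def build_image_prompt(keywords: list[str], exclude: list[str], prefix: str = PREFIX) -> str:
    """Drop the excluded keywords and append the rest, comma-separated, to the prefix."""
    excluded = {word.lower() for word in exclude}
    kept = [keyword for keyword in keywords if keyword.lower() not in excluded]
    return prefix + ",".join(kept)


print(build_image_prompt(KEYWORDS, EXCLUDE))
```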

Here are the results. A single generation pass felt a bit bland, without much character, so I stacked 4 Loopback passes to bring out some quirks.
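The images here were generated from the web UI, not from code. As a rough illustration of what stacking passes means, the sketch below feeds each img2img result back in as the next input. It is a simplified stand-in for the web UI's Loopback script, not its actual implementation. It assumes the webui's /sdapi/v1/img2img endpoint accepts `init_images`, `prompt`, and `denoising_strength` and returns base64 images under `images`, and the function names are hypothetical.

```python
import json
from typing import Callable
from urllib import request

IMG2IMG_URL = "http://localhost:7860/sdapi/v1/img2img"


def post_img2img(payload: dict, url: str = IMG2IMG_URL) -> dict:
    """Send one img2img request to a locally running webui (started with --api)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def run_loopback(
    init_image_b64: str,
    prompt: str,
    post: Callable[[dict], dict] = post_img2img,
    loops: int = 4,
    denoising_strength: float = 0.8,
) -> list[str]:
    """Run img2img `loops` times, feeding each output back in; keep every pass."""
    outputs = []
    current = init_image_b64
    for _ in range(loops):
        response = post(
            {
                "init_images": [current],
                "prompt": prompt,
                "denoising_strength": denoising_strength,
            }
        )
        current = response["images"][0]  # first image of this pass
        outputs.append(current)
    return outputs
```

Keeping every intermediate result makes it possible to compare the passes afterwards and pick one by eye.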

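So I picked the fourth image. The black thing behind the subject in the source image is a 3D scanner, and in the generated images it turned into a background pattern. Also, the hairstyle started out looking like Yugi's hair, but it seems the subject started wearing a hat because hat is in the prompt.

Generating the card information

Now for generating the card's contents. This time I targeted only Normal Monster cards and had ChatGPT generate the so-called flavor text. What I really wanted was to throw in an image and get a Synchro, Link, or Fusion (!?) monster out, which would have been fun. However, because of ChatGPT's limits, only about 15 or 16 cards' worth of data fit into a single prompt. Since I was targeting cards released from April 1, 2020 onward, a lot of Abyss Actor (魔界劇団) cards ended up in that sample, and the card information was not generated in a proper form. That is why I limited the input to Normal Monster cards.
The input data consisted of the image caption, name, card type, text, ATK, DEF, level, type (race), and attribute. I pack as much of this reference data into ChatGPT as it will take. Then I run the image generated earlier through CLIP to get its caption, and have ChatGPT infer the remaining attribute values from that caption. The prompt is below.

Using the values shown under data as a reference, generate data like the following.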

caption: a man in a green hat and a green jacket with a green light on his face and a green light on his face, Cedric Seaut (Keos Masons), character portrait, a character portrait, crystal cubism
name :  
type : 
desc :
atk :
def :
level :
race :
attr :

# data
## record1
caption : a cartoon of a purple dragon flying through the air with a disc in its hand and a disc in the foreground, Baiōken Eishun, official art, a character portrait, rayonism
name : Anotherverse Dragon
type : Normal Monster
desc : ''This secretive dragon has conquered many dimensions, but can only be seen by the champion of time.''
atk : 2500
def : 2000
level : 8
race : Dragon
attr : LIGHT

## record2
caption : a computer generated image of a green and black object with a black center and two gold dots on the center, Ai-Mitsu, cybernetic, cyberpunk art, sots art
name : Brave Scizzar
type : Normal Monster
desc : Armed with a mass of scissors, this monster can clip an enemy into any desired shape and size.
atk : 1300
def : 1000
level : 4
race : Machine
attr : DARK

## record3
caption : a cartoon character with a gun in his hand and a sword in his hand, in front of a background of blue and purple water, Andor Basch, official art, concept art, rayonism
name : Clavkiys, the Magikey Skyblaster
type : Normal Monster
desc : ''Everyone is full of possibilities. Everything you do, moving forwards, stopping, closing, opening, is all in your hands.There are many doors in this world. To open them requires the Magic Key... 2 keys and 1 door... 2 wills and 1 body. Once the doors are opened, the worlds will be connected and the great power revealed.''
atk : 1900
def : 1900
level : 4
race : Warrior
attr : DARK
....
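A prompt in this shape could be generated from (caption, card) pairs, where each card is an entry from the Yu-Gi-Oh! API response. A minimal sketch with hypothetical helper names; the only mapping assumed is that the prompt's `attr` corresponds to the API's `attribute` field.

```python
FIELDS = ["name", "type", "desc", "atk", "def", "level", "race", "attr"]
# The prompt uses "attr" where the API response uses "attribute".
API_KEYS = {"attr": "attribute"}


def format_record(index: int, caption: str, card: dict) -> str:
    """Render one reference card as a '## recordN' block."""
    lines = [f"## record{index}", f"caption : {caption}"]
    lines += [f"{field} : {card[API_KEYS.get(field, field)]}" for field in FIELDS]
    return "\n".join(lines)


def build_card_prompt(instruction: str, target_caption: str, records: list[tuple[str, dict]]) -> str:
    """Instruction, an empty template for the target image, then the examples."""
    template = [f"caption: {target_caption}"] + [f"{field} :" for field in FIELDS]
    examples = [
        format_record(i, caption, card)
        for i, (caption, card) in enumerate(records, start=1)
    ]
    data_section = "# data\n" + "\n\n".join(examples)
    return "\n\n".join([instruction, "\n".join(template), data_section])


records = [
    (
        "a computer generated image of a green and black object",
        {
            "name": "Brave Scizzar", "type": "Normal Monster",
            "desc": "Armed with a mass of scissors, this monster can clip an enemy into any desired shape and size.",
            "atk": 1300, "def": 1000, "level": 4,
            "race": "Machine", "attribute": "DARK",
        },
    )
]
print(build_card_prompt("INSTRUCTION", "a man in a green hat", records))
```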

The output was as follows.

I will generate data like the following.

caption: a man in a green hat and a green jacket with a green light on his face and a green light on his face, Cedric Seaut (Keos Masons), character portrait, a character portrait, crystal cubism
name : Green Hat Guy
type : Normal Monster
desc : ''This mysterious man has the power to control nature and wield the forces of the earth.''
atk : 1800
def : 1200
level : 4
race : Spellcaster
attr : EARTH
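To pass this output on to a later step, the "key : value" lines can be parsed into a dictionary. The following is a sketch with a hypothetical `parse_card` helper. It assumes one field per line and strips the doubled single quotes that wrap the description.

```python
KEYS = {"caption", "name", "type", "desc", "atk", "def", "level", "race", "attr"}
NUMERIC = {"atk", "def", "level"}


def parse_card(text: str) -> dict:
    """Parse 'key : value' lines into a dict; lines with unknown keys are ignored."""
    card = {}
    for line in text.splitlines():
        # Split on the first colon only, so colons inside values survive.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in KEYS:
            continue
        value = value.strip().strip("'")
        if key in NUMERIC and value.isdigit():
            card[key] = int(value)
        else:
            card[key] = value
    return card


output = """name : Green Hat Guy
type : Normal Monster
desc : ''This mysterious man has the power to control nature and wield the forces of the earth.''
atk : 1800
def : 1200
level : 4
race : Spellcaster
attr : EARTH"""

print(parse_card(output))
```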

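All that's left is to roughly translate the name and details into Japanese with DeepL and render the card image with 遊戯王オリカ作成オリカメーカー, a custom Yu-Gi-Oh! card maker.

So, when Mizutani, our CHRO, is turned into a Yu-Gi-Oh! card, the result is apparently an EARTH-attribute Spellcaster. You know what that is: the specs are pretty much Gemini Elf (ヂェミナイ・エルフ).

Overall architecture

The data and processing flow for this article looks like this. It took quite a few different tools, which was a hassle.

Thoughts

Having tried it, I feel that copying the art style this way was a stretch. Even when I fussed over the prompt and insisted that this is a Yu-Gi-Oh! card, nothing that really looked the part came out. Looking into it afterwards, I found a Yu-Gi-Oh! LoRA model on Civitai, and thought that would have done the job. I also considered training a LoRA myself, but it looked like a hassle and I couldn't get motivated, so I tried training a Textual Inversion embedding instead, which is built into the webui. It was projected to take about 20 hours, though, so I gave up.
Also, and I think this is a recent Yu-Gi-Oh! trend, Effect Monster text is long. That means I can't feed ChatGPT a large amount of data, so the card text would have to be inferred from only a few examples, and I expected mediocre results, so I dropped the idea. There is advice online that adding a prompt along the lines of "Please wait until I have finished entering the data." lets ChatGPT summarize even long input, but that did not work in my case (GPT-3.5). Personally, I was hoping some fluke would produce a laughably overpowered Effect Monster.
As for the card contents generated by ChatGPT, I thought the specs were reasonably interesting. A Level 4 monster with 1800 ATK / 1200 DEF is fairly reasonable (on the strong side, if anything), and the flavor text felt decent too. Still, both the image generation and the card information generation fell about one step short. If they got a bit better, turning five or six acquaintances into Yu-Gi-Oh! cards without asking (!?) and building a deck around them might actually be a fun game.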
