Learning Gradio with ChatGPT and Python: Data Visualization

Last updated at 2024-01-22, posted at 2024-01-22

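Good morning, good afternoon, and good evening. Today I am once again joined by ChatGPT for a casual chat on the theme "Learning Gradio with ChatGPT and Python: Data Visualization." ChatGPT, thank you for joining me.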

assistant:

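Good morning, or good evening, as the case may be. Today I'll talk about data visualization with Gradio.

Gradio is a Python library that makes it easy to build demos and prototypes of machine learning models. Its strength is that you can put an interface on a machine learning model and share it easily.

From a data visualization standpoint, Gradio lets you display many types of output (text, numbers, images, plots, and so on), which makes a model's output easier to understand visually.

For example, you can display the results of an image classifier like this: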

import gradio as gr
import tensorflow as tf
import numpy as np

# Load in the model
model = tf.keras.applications.InceptionV3()

def classify_image(inp):
    inp = inp[np.newaxis, ...]
    prediction = model.predict(inp).flatten()
    # The prediction vector has 1000 entries, so valid indices run from 0 to 999
    return {f"class_{i}": float(prediction[i]) for i in range(1000)}

image = gr.inputs.Image(shape=(299, 299, 3))
label = gr.outputs.Label(num_top_classes=3)

gr.Interface(fn=classify_image, inputs=image, outputs=label).launch()

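This example takes an image as input, runs 1000-class image classification with the InceptionV3 model, and displays the three classes with the highest predicted probability.

In this way, Gradio lets you visualize a model's predictions, which makes debugging and interpreting the results easier.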
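To make the top-3 display more concrete, here is a minimal sketch of the selection that a label output with num_top_classes=3 roughly performs on the dictionary returned by classify_image. It is plain Python and does not call Gradio. The helper names to_label_dict and top_k and the five sample scores are made up for illustration. Note that a vector of n scores is indexed from 0 to n - 1.

```python
def to_label_dict(scores):
    """Map each score to a hypothetical class name, indexing from 0."""
    return {f"class_{i}": float(s) for i, s in enumerate(scores)}


def top_k(label_scores, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(label_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up scores standing in for a model's output vector (assumption).
sample_scores = [0.05, 0.60, 0.10, 0.20, 0.05]
top3 = top_k(to_label_dict(sample_scores), k=3)
print(top3)
```

The dictionary returned to Gradio can hold all classes; only the display is limited to the best few.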

user:

Thank you. As one example, could you adapt the Image Captioning code from https://zenn.dev/yuiseki/articles/bd73f607a2acf0 (shown below) so that it runs in Gradio?

# Load the model
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

# Define a function that generates an image caption
from PIL import Image
import requests

def generate_caption(image_url):
  raw_image = Image.open(requests.get(image_url, stream=True).raw).convert('RGB')
  w,h = raw_image.size
  display(raw_image.resize((w//15,h//15)))

  # You can give the model a hint like this
  # Remove it if the type of image is unknown
  text = "a photography of"
  inputs = processor(raw_image, text, return_tensors="pt")
  # max_length=20 is recommended, but it is set longer here to get a more detailed description
  out = model.generate(**inputs, min_length=20, max_length=50)
  generated_caption = processor.decode(out[0], skip_special_tokens=True)
  print(generated_caption)

assistant:

That's a great question! To run existing code like this in Gradio, you need a function that performs the model's prediction, adapted so that it fits Gradio's interface.

Concretely, the following changes make it runnable in Gradio:

# Load the model
from transformers import BlipProcessor, BlipForConditionalGeneration
import gradio as gr

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

# Define a function that generates an image caption
from PIL import Image
import requests

def generate_caption(image: gr.inputs.Image):  
  raw_image = Image.fromarray(image)
  w,h = raw_image.size
  # You can give the model a hint like this
  # Remove it if the type of image is unknown
  text = "a photography of"
  inputs = processor(raw_image, text, return_tensors="pt")
  # max_length=20 is recommended, but it is set longer here to get a more detailed description
  out = model.generate(**inputs, min_length=20, max_length=50)
  generated_caption = processor.decode(out[0], skip_special_tokens=True)
  return generated_caption

iface = gr.Interface(fn=generate_caption, inputs=gr.inputs.Image(), outputs="text")
iface.launch()

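The code above changes the generate_caption function. Because the image parameter is now supplied by Gradio (as a NumPy array by default), it can be passed directly to Image.fromarray.

Next, a Gradio interface is created and given the function. inputs is set to gr.inputs.Image() so that the input is an image, and outputs is set to text so that the output is text.

Finally, iface.launch() starts the interface.

The Gradio interface opens in a browser such as Chrome or Firefox, where you can upload an image and see the generated caption.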

user:

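Thank you. Please modify the code, using the code above as the base.
Assume that version 2.x of the gradio module is in use.
Do not use gr.inputs, because it raises an error.
Set share=True in the launch() method.
Please also include explanations in the code, in Japanese and in as much detail as possible.
Using the code above as the base, please write out the full code with the following error fixed.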

2024-01-22 14:43:00.499215: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-01-22 14:43:00.540087: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-22 14:43:00.540114: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-22 14:43:00.541546: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-22 14:43:00.549247: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-22 14:43:01.541901: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
preprocessor_config.json: 100%
445/445 [00:00<00:00, 39.2kB/s]
tokenizer_config.json: 100%
527/527 [00:00<00:00, 43.9kB/s]
vocab.txt: 100%
232k/232k [00:00<00:00, 700kB/s]
tokenizer.json: 100%
711k/711k [00:00<00:00, 4.22MB/s]
special_tokens_map.json: 100%
125/125 [00:00<00:00, 16.4kB/s]
config.json: 100%
4.60k/4.60k [00:00<00:00, 333kB/s]
model.safetensors: 100%
1.88G/1.88G [00:28<00:00, 75.8MB/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 12
      9 from PIL import Image
     10 import requests
---> 12 def generate_caption(image: gr.inputs.Image):  
     13   raw_image = Image.fromarray(image)
     14   w,h = raw_image.size

AttributeError: module 'gradio' has no attribute 'inputs'

assistant:

The error message reports that the gradio module has no attribute 'inputs'. This error indicates that in Gradio 2.x you should use interfaces instead of inputs.

However, as you requested, the Gradio interface can be created with the gr.Interface function instead of gr.inputs.

The full corrected code is therefore as follows.

# Import the Gradio and Transformers libraries
from transformers import BlipProcessor, BlipForConditionalGeneration
import gradio as gr
from PIL import Image

# Load the model and the processor
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

# Define the function that generates an image caption. For the Gradio interface it takes the uploaded image as input (converted to a PIL Image below) and must return a string.
def generate_caption(img):
    raw_image = Image.fromarray(img.astype('uint8'), 'RGB')
    # Hint text passed to the processor together with the image
    text = "a photography of"
    inputs = processor(raw_image, text, return_tensors="pt")
    # Pass the inputs to the model and generate the output. max_length determines how long the generated caption can be.
    out = model.generate(**inputs, min_length=20, max_length=50)
    # Decode the generated caption
    generated_caption = processor.decode(out[0], skip_special_tokens=True)
    return generated_caption

# Define the Gradio interface. The input is of type 'Image' and the output is of type 'Text'
iface = gr.Interface(fn=generate_caption, inputs='Image', outputs='Text')
# Launch the interface. With share=True, Gradio generates a public URL that can easily be shared with others
iface.launch(share=True)

This code makes the image captioning model accessible through Gradio: when a user uploads an image, the model generates a caption for it and the caption is shown in the interface.
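The earlier "AttributeError: module 'gradio' has no attribute 'inputs'" is consistent with an installed Gradio that no longer exposes the legacy gr.inputs and gr.outputs namespaces. Before rewriting code for a particular API generation, it can help to check what is actually installed. The following is a minimal sketch using only the standard library. The helper names installed_version and has_legacy_namespaces are hypothetical, and the two SimpleNamespace objects are stand-ins rather than real Gradio modules.

```python
from importlib import metadata
from types import SimpleNamespace


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def has_legacy_namespaces(module):
    """True if the module still exposes the old `inputs`/`outputs` namespaces."""
    return hasattr(module, "inputs") and hasattr(module, "outputs")


# Stand-in objects (assumptions) so the check can be shown without Gradio.
legacy_like = SimpleNamespace(inputs=object(), outputs=object())
modern_like = SimpleNamespace(Image=object, Textbox=object)

print("gradio version:", installed_version("gradio"))
print("legacy-like module:", has_legacy_namespaces(legacy_like))
print("modern-like module:", has_legacy_namespaces(modern_like))
```

With Gradio imported as gr, calling has_legacy_namespaces(gr) would show whether gr.inputs can be used at all in the current environment.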

user:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 23
     20     return generated_caption
     22 # Define the Gradio interface. The input is of type 'Image' and the output is of type 'Text'
---> 23 iface = gr.Interface(fn=generate_caption, inputs='Image', outputs='Text')
     24 # Launch the interface. With share=True, Gradio generates a public URL that can easily be shared with others
     25 iface.launch(share=True)

File ~/.local/lib/python3.10/site-packages/gradio/interface.py:224, in Interface.__init__(self, fn, inputs, outputs, examples, cache_examples, examples_per_page, live, title, description, article, thumbnail, theme, css, allow_flagging, flagging_options, flagging_dir, flagging_callback, analytics_enabled, batch, max_batch_size, api_name, _api_mode, allow_duplication, concurrency_limit, js, head, additional_inputs, additional_inputs_accordion, **kwargs)
    218         warnings.warn(
    219             "Cache examples cannot be used with state inputs and outputs."
    220             "Setting cache_examples to False."
    221         )
    222     self.cache_examples = False
--> 224 self.main_input_components = [
    225     get_component_instance(i, unrender=True)
    226     for i in inputs  # type: ignore
    227 ]
    228 self.additional_input_components = [
    229     get_component_instance(i, unrender=True)
    230     for i in additional_inputs  # type: ignore
    231 ]
    232 if additional_inputs_accordion is None:

File ~/.local/lib/python3.10/site-packages/gradio/interface.py:225, in <listcomp>(.0)
    218         warnings.warn(
    219             "Cache examples cannot be used with state inputs and outputs."
    220             "Setting cache_examples to False."
    221         )
    222     self.cache_examples = False
    224 self.main_input_components = [
--> 225     get_component_instance(i, unrender=True)
    226     for i in inputs  # type: ignore
    227 ]
    228 self.additional_input_components = [
    229     get_component_instance(i, unrender=True)
    230     for i in additional_inputs  # type: ignore
    231 ]
    232 if additional_inputs_accordion is None:

File ~/.local/lib/python3.10/site-packages/gradio/components/base.py:350, in get_component_instance(comp, render, unrender)
    342 """
    343 Returns a component instance from a string, dict, or Component object.
    344 Parameters:
   (...)
    347     unrender: whether to unrender the component. If True, unrenders the the component (if already rendered) -- this is useful when constructing an Interface or ChatInterface inside of a Blocks. If False, does not do anything.
    348 """
    349 if isinstance(comp, str):
--> 350     component_obj = component(comp, render=render)
    351 elif isinstance(comp, dict):
    352     name = comp.pop("name")

File ~/.local/lib/python3.10/site-packages/gradio/components/base.py:332, in component(cls_name, render)
    331 def component(cls_name: str, render: bool) -> Component:
--> 332     obj = utils.component_or_layout_class(cls_name)(render=render)
    333     if isinstance(obj, BlockContext):
    334         raise ValueError(f"Invalid component: {obj.__class__}")

File ~/.local/lib/python3.10/site-packages/gradio/utils.py:461, in component_or_layout_class(cls_name)
    456     if name.lower() == cls_name.replace("_", "") and (
    457         issubclass(cls, gradio.components.Component)
    458         or issubclass(cls, gradio.blocks.BlockContext)
    459     ):
    460         return cls
--> 461 raise ValueError(f"No such component or layout: {cls_name}")

ValueError: No such component or layout: Image

assistant:

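In Gradio 2.x, the way inputs and outputs are specified for an interface has changed: change 'Image' to 'gr.inputs.Image' and 'Text' to 'gr.outputs.Textbox'. That's a very good question. The error report shows that the problem lies in how the inputs and outputs are configured: the values passed to the inputs and outputs arguments of gr.Interface do not appear to be valid component names. In Gradio 2.x you need to specify concrete component types.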
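The traceback itself hints at why the string 'Image' was rejected: in component_or_layout_class, the registered class name is lowercased (name.lower()) while the requested string only has its underscores removed (cls_name.replace("_", "")). A capitalised string therefore cannot match. The sketch below mimics just that comparison with stand-in classes. REGISTRY, find_component, and the two classes are hypothetical and are not Gradio's real implementation.

```python
class Image:
    """Stand-in for a component class; not the real gradio.Image."""


class Textbox:
    """Stand-in for a component class; not the real gradio.Textbox."""


# Hypothetical registry of component classes keyed by class name.
REGISTRY = {"Image": Image, "Textbox": Textbox}


def find_component(cls_name):
    """Mimic the comparison shown in the traceback: the registered name is
    lowercased, but the requested string is only stripped of underscores."""
    for name, cls in REGISTRY.items():
        if name.lower() == cls_name.replace("_", ""):
            return cls
    raise ValueError(f"No such component or layout: {cls_name}")


print(find_component("image"))  # the lowercase string matches
try:
    find_component("Image")     # the capitalised string does not
except ValueError as err:
    print(err)
```

This suggests that lowercase shortcut strings such as 'image' might have been accepted. That is an inference from the traceback lines, not something confirmed in the conversation.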

Here is the full corrected code:

import gradio as gr
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

# Load the model and the processor
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

def generate_caption(img):
    # Convert the image received from Gradio to a PIL Image
    raw_image = Image.fromarray(img)
    # Hint text passed to the processor together with the image
    text = "a photography of"
    inputs = processor(raw_image, text, return_tensors="pt")
    # Generate the prediction with the model. max_length sets the length of the generated caption
    out = model.generate(**inputs, min_length=20, max_length=50)
    # Decode the prediction, skipping special tokens
    generated_caption = processor.decode(out[0], skip_special_tokens=True)
    # Return the generated caption
    return generated_caption

# Configure the Gradio interface
iface = gr.Interface(
    fn=generate_caption,  # the function defined above
    inputs=gr.inputs.Image(),  # set the input type to Image
    outputs=gr.outputs.Textbox()  # set the output type to a text box
)
# Launch the interface (share=True generates a public URL)
iface.launch(share=True)

With this fix, when a user uploads an image to the Gradio interface, the function above is called and the generated caption is written to the text box.

user:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 25
     20     return generated_caption
     22 # Configure the Gradio interface
     23 iface = gr.Interface(
     24     fn=generate_caption,  # the function defined above
---> 25     inputs=gr.inputs.Image(),  # set the input type to Image
     26     outputs=gr.outputs.Textbox()  # set the output type to a text box
     27 )
     28 # Launch the interface (share=True generates a public URL)
     29 iface.launch(share=True)

AttributeError: module 'gradio' has no attribute 'inputs'

assistant:

The error message above says that 'gradio' has no attribute 'inputs'. This is because Gradio 2.x changed to using 'gradio.interfaces.inputs' rather than 'gradio.inputs'. The namespace prefix therefore needs to be changed accordingly.

The corrected code is as follows.

import gradio as gr
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

# Load the model and the processor
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

def generate_caption(img):
    # Convert the image received from Gradio to a PIL Image
    raw_image = Image.fromarray(img)
    # Hint text passed to the processor together with the image
    text = "a photography of"
    inputs = processor(raw_image, text, return_tensors="pt")
    # Generate the prediction with the model. max_length sets the length of the generated caption
    out = model.generate(**inputs, min_length=20, max_length=50)
    # Decode the prediction, skipping special tokens
    generated_caption = processor.decode(out[0], skip_special_tokens=True)
    # Return the generated caption
    return generated_caption

# Configure the Gradio interface
iface = gr.Interface(
    fn=generate_caption,  # the function defined above
    inputs=gr.interfaces.inputs.Image(),  # set the input type to Image
    outputs=gr.interfaces.outputs.Textbox()  # set the output type to a text box
)
# Launch the interface (share=True generates a public URL)
iface.launch(share=True)

With this code, when a user uploads an image, the function above is called, a caption for that image is generated, and the caption is written to the text box.

user:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 25
     20     return generated_caption
     22 # Configure the Gradio interface
     23 iface = gr.Interface(
     24     fn=generate_caption,  # the function defined above
---> 25     inputs=gr.interfaces.inputs.Image(),  # set the input type to Image
     26     outputs=gr.interfaces.outputs.Textbox()  # set the output type to a text box
     27 )
     28 # Launch the interface (share=True generates a public URL)
     29 iface.launch(share=True)

AttributeError: module 'gradio' has no attribute 'interfaces'

assistant:

In Gradio 2.x, the input and output components are not under 'gr.inputs' or 'gr.outputs' but are direct parts of 'gr.Interface'. The earlier 'gr.interfaces.inputs.Image()' and 'gr.interfaces.outputs.Textbox()' were wrong. The correct names are 'gr.Image' and 'gr.Textbox'.

With the error fixed, the improved code looks like this:

import gradio as gr
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

# Load the model and the processor
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-large")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-large")

def generate_caption(img):
    # Convert the image received from Gradio to a PIL Image
    raw_image = Image.fromarray(img)
    # Hint text passed to the processor together with the image
    text = "a photography of"
    inputs = processor(raw_image, text, return_tensors="pt")
    # Generate the prediction with the model. max_length sets the length of the generated caption
    out = model.generate(**inputs, min_length=20, max_length=50)
    # Decode the prediction, skipping special tokens
    generated_caption = processor.decode(out[0], skip_special_tokens=True)
    # Return the generated caption
    return generated_caption

# Configure the Gradio interface
iface = gr.Interface(
    fn=generate_caption,  # the function defined above
    inputs=gr.Image(),  # set the input type to Image
    outputs=gr.Textbox()  # set the output type to a text box
)
# Launch the interface (share=True generates a public URL)
iface.launch(share=True)

With this corrected code, you can upload an image through the Gradio interface, and the model automatically generates and displays a caption for the uploaded image.
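One possible refinement, which is not part of the conversation above: if Submit is clicked before an image is uploaded, the function receives None, and Image.fromarray(None) would raise an error. The following minimal sketch guards against that and assumes gr.Image(type="pil") is used so that the function receives a PIL image directly. The helpers make_caption_fn and build_demo and the stub fake_captioner are hypothetical. In the real app, the captioner would wrap the processor and model calls shown above.

```python
def make_caption_fn(captioner):
    """Wrap a captioning callable so that an empty upload returns a message."""
    def generate_caption(img):
        if img is None:  # nothing was uploaded
            return "Please upload an image first."
        return captioner(img)
    return generate_caption


def build_demo(captioner):
    """Build the interface. Gradio is imported here so that the wrapper
    above can be exercised without it being installed."""
    import gradio as gr
    return gr.Interface(
        fn=make_caption_fn(captioner),
        inputs=gr.Image(type="pil"),  # hand the function a PIL image
        outputs=gr.Textbox(),
    )


def fake_captioner(img):
    """Stub used only to exercise the wrapper (assumption, not a model)."""
    return "a photography of a test image"


caption_fn = make_caption_fn(fake_captioner)
print(caption_fn(None))
print(caption_fn(object()))
```

Calling build_demo(captioner).launch(share=True) would then start the interface in the same way as the code above.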

user:

It worked! Thank you! I used an image from the Togo picture gallery on TogoTV.

assistant:

I'm glad to hear that! If you have any other questions, feel free to ask and I'll answer whatever I can. I'm happy I could help!
