
How to Encode an Image in Base64 and Evaluate It with the GPT API

Last updated at 2024-06-04, posted at 2024-06-04

Introduction

This article shows how to rate a photo automatically. The system looks at the content of the photo and returns a rating from 0 to 5. The evaluation is automated with the OpenAI API.

Problem and approach

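At the time of writing, a local image file cannot be attached as-is to a Chat Completions request; the API takes an image either as a URL or as Base64-encoded data. So we first encode (convert) the image and then send it to the API for evaluation.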

Reference article

Prerequisites

This article assumes you have already obtained an OpenAI API key and configured it.
If you do not know how to get an API key, see:
https://platform.openai.com/api-keys
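
As a minimal sketch of what "configured" can mean here, the check below assumes the key is stored in the OPENAI_API_KEY environment variable, which is the variable the official openai Python package reads by default. The helper name require_api_key is hypothetical and is not part of any library.

```python
import os


def require_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; create a key and export it first."
        )
    return key
```

Calling require_api_key() at the top of a script fails early with a readable message instead of an authentication error in the middle of a request.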

1. Encoding (converting) the image

First, encode the image to be evaluated in Base64. The Python code below reads the specified image file, converts it to Base64, and returns it as a data URL.

What is Base64?

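Base64, as the name suggests, is a base-64 encoding scheme: it represents arbitrary data using only 64 characters, namely letters (a-z, A-Z), digits (0-9), and two symbols (+, /). A 65th character, =, appears only as padding at the end of the output.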
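
To make the 64-character alphabet concrete, here is a small sketch using only the standard library. The helper name to_base64_text is made up for this example. Every 3 input bytes become 4 output characters, and shorter input is padded with =.

```python
import base64
import string

# The 64 symbols of the standard Base64 alphabet ('=' is used only for padding).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def to_base64_text(data: bytes) -> str:
    """Encode bytes as Base64 and return the result as a str."""
    return base64.b64encode(data).decode("ascii")


print(len(ALPHABET))           # 64
print(to_base64_text(b"Man"))  # TWFu  (3 bytes -> 4 characters)
print(to_base64_text(b"Ma"))   # TWE=  (2 bytes -> 3 characters + 1 padding)
print(to_base64_text(b"M"))    # TQ==  (1 byte  -> 2 characters + 2 padding)
```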

Bytes vs. text

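There is plenty of information about Base64 encoding, but note that base64.b64encode on its own returns a bytes object, not a string. If you embed that bytes object directly in the data URL, the result is not recognized as an image. Calling decode('utf-8') converts the bytes to a str.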
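
The difference is easy to see with a tiny example. This is only an illustration: the three bytes b"Man" stand in for real image data, and the variable names are arbitrary.

```python
import base64

raw = base64.b64encode(b"Man")   # bytes: b'TWFu'
text = raw.decode("utf-8")       # str:   'TWFu'

# Formatting the bytes object directly leaks the b'...' wrapper into the URL.
wrong = f"data:image/jpeg;base64,{raw}"
# Formatting the decoded str gives a well-formed data URL.
right = f"data:image/jpeg;base64,{text}"

print(type(raw).__name__, type(text).__name__)  # bytes str
print(wrong)  # data:image/jpeg;base64,b'TWFu'
print(right)  # data:image/jpeg;base64,TWFu
```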

import base64

def encode_image_to_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return f"data:image/jpeg;base64,{encoded_string}"

# Specify the path to the image
file_path = "sample-image.jpeg"
encoded_image = encode_image_to_base64(file_path)
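
The function above always labels the data as image/jpeg. If you also want to send PNG or other formats, one possible variation is to guess the MIME type from the file extension with the standard mimetypes module. This is a sketch, not part of the original method; the function name encode_image_to_data_url and the default_mime parameter are assumptions made for this example.

```python
import base64
import mimetypes


def encode_image_to_data_url(file_path, default_mime="image/jpeg"):
    """Return the file as a data URL, guessing the MIME type from the extension."""
    mime_type, _ = mimetypes.guess_type(file_path)
    # Fall back to the default when the extension is unknown or not an image.
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = default_mime
    with open(file_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

Note that this only looks at the file name; it does not inspect the file contents.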

2. Setting up the OpenAI API and getting the result

Next, use the OpenAI API to evaluate the image. Build a message that contains the encoded image and a prompt, and send it to OpenAI's Chat Completions API.
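
The message described here has two parts: a system message with the rating instructions, and a user message whose content is a list holding a text part and an image_url part. As a sketch, that structure can be built by a small helper. The name build_messages and its parameters are hypothetical and only used for illustration.

```python
def build_messages(system_prompt, user_prompt, image_data_url, detail="low"):
    """Build a Chat Completions message list with one text part and one image part."""
    if detail not in ("low", "high", "auto"):
        raise ValueError('detail must be "low", "high", or "auto"')
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": image_data_url, "detail": detail},
                },
            ],
        },
    ]
```

Keeping the structure in one function makes it easier to swap the prompt or the image without retyping the nested dictionaries.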

evaluate_image.py

import openai
import base64

def encode_image_to_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return f"data:image/jpeg;base64,{encoded_string}"

# Specify the path to the image
file_path = "sample-image.jpeg"
encoded_image = encode_image_to_base64(file_path)


# Adjacent string literals are joined, so no indentation leaks into the prompt.
prompt = (
    "This is an app to rate the quality of an image based on "
    "specific criteria on a five-point scale. Please provide "
    "a strict evaluation based on visible factors such as "
    "clarity, composition, and overall appeal."
)

# Note: openai.ChatCompletion.create is the pre-1.0 interface of the openai
# package (for example: pip install "openai<1.0"). It was removed in openai 1.0.
response = openai.ChatCompletion.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": (
             "You are an AI trained to evaluate the quality "
             "of images based on specific criteria. Evaluate the "
             "quality of an image very strictly on a scale of 0 to 5 "
             "from the provided image. If it is not relevant, answer "
             "with 0. No explanation is needed. Answer with a single "
             "digit from 0 to 5. Example: 0, 1, 2, 3, 4, 5."
         )},
        {"role": "user",
         "content": [
             {"type": "text",
              "text": prompt},
             {"type": "image_url",
              "image_url": {
                  "url": encoded_image,
                  "detail": "low"  # image detail level: "low", "high", or "auto"
              }},
         ]},
    ],
    temperature=0,   # keep the answer as deterministic as possible
    max_tokens=100,
    n=1              # generate a single response
)

# Show the response
print(response["choices"][0]["message"]["content"])  # prints a digit from 0 to 5
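
The listing above calls openai.ChatCompletion.create, which exists only in versions of the openai package before 1.0. In openai 1.0 and later, the same request goes through a client object and the response is read with attribute access. The sketch below shows that form and also converts the reply to an integer, since the model returns text. The function names parse_score and rate_image are hypothetical, and messages is assumed to be the same list used in the listing above.

```python
import re


def parse_score(reply):
    """Return the rating as an int if the reply is a single digit 0-5, else None."""
    match = re.fullmatch(r"\s*([0-5])\s*", reply or "")
    return int(match.group(1)) if match else None


def rate_image(messages, client=None, model="gpt-4o"):
    """Send the messages with the openai>=1.0 client and return the parsed rating."""
    if client is None:
        # Imported here so the parsing helper works without the package installed.
        from openai import OpenAI  # requires openai>=1.0
        client = OpenAI()          # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
        max_tokens=100,
        n=1,
    )
    return parse_score(response.choices[0].message.content)
```

Returning None for anything other than a single digit from 0 to 5 lets the caller decide whether to retry or to treat the image as unrated.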

Summary

This article showed how to implement an application that rates images. The evaluation criteria can be customized to fit your use case. Try applying this approach in your own projects and applications.
