
I built a Discord bot that sends back answers and explanations for English problems

Posted at 2024-05-19, last updated at 2024-05-21

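OpenAI has released GPT-4o!

Answer accuracy has improved considerably, which makes it easy to solve homework problems straight from an image.
So I built a bot that, when you send it a photo, automatically sends back a PDF with the answers and explanations.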

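Architecture

Libraries

fpdf2, openai python, discord.py, base64

Concept diagram

The bot receives a photo on Discord, asks GPT-4o for the answers and explanations, turns the reply into a PDF, and posts it back to the channel. Docker runs this whole flow.

File structure

Walkthrough of main.py

The full code is below. I wrote it in a hurry, so it is quite messy.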

import discord # discord.py library
import os # Needed to read the API key and bot token from the environment

from fpdf import FPDF # Text-to-PDF conversion

import base64 # Encode the image as base64
from openai import OpenAI # OpenAI API client

# Read the environment variables that docker-compose.yaml passes in from .env
TOKEN = os.getenv("TOKEN")
OpenAIkey = os.getenv("OpenAIkey")

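# Discord client; logs when it connects
class MyClient(discord.Client):
    async def on_ready(self):
        print(f'Logged on as {self.user}!')

    # Note: the @client.event on_message handler below replaces this method on the client instance
    async def on_message(self, message):
        print(f'Message from {message.author}: {message.content}')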

intents = discord.Intents.default()
intents.message_content = True

client = MyClient(intents=intents)

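@client.event
async def on_message(messages):
    if messages.content == "test": # Triggered when a user posts the message "test"
        print("ok")

        # Find the attached image; stays None if the message has no image
        IMAGE_PATH = None
        if messages.attachments:
            for attachment in messages.attachments:
                # content_type can be None, so check it before calling startswith
                if attachment.content_type and attachment.content_type.startswith("image"):
                    # Download the image next to main.py
                    await attachment.save(f"./{attachment.filename}")
                    IMAGE_PATH = f"./{attachment.filename}"

        # Nothing to solve without an image
        if IMAGE_PATH is None:
            return

        # Read the image and encode it as a base64 string
        def encode_image(image_path):
            with open(image_path, "rb") as image_file:
                return base64.b64encode(image_file.read()).decode("utf-8")

        base64_image = encode_image(IMAGE_PATH)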

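        # Choose the model
        MODEL = "gpt-4o"

        # Create the OpenAI client (named openai_client so it does not shadow the Discord client)
        openai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", OpenAIkey))

        # Send the prompt and the image
        # System prompt in English: "You are the best high school teacher.
        # Write a report with the answers and explanations for every problem in the photo."
        completion = openai_client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system", "content": "あなたは最高の高校教師です。写真の問題の全ての回答と解説をしたレポートを書いてください。"},
                {"role": "user", "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}}
                ]}
            ]
        )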

        # Create the PDF
        pdf = FPDF()
        pdf.add_page()

        # Register a Japanese font
        font_path = r"./yugothib.ttf"
        pdf.add_font("yugothib", fname=font_path, uni=True)
        pdf.set_font("yugothib", size=12)

        # Write the model's reply into the PDF
        text = completion.choices[0].message.content
        pdf.multi_cell(0, 10, txt=text, align="L")

        # Save the PDF to disk
        pdf.output("output.pdf")

        # Post the PDF to the channel
        await messages.channel.send(file=discord.File('./output.pdf', filename='aaa.pdf'))

# Run the Discord bot
client.run(TOKEN)
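
One thing to watch in the handler above: the OpenAI request is a blocking call inside an async function, so the bot cannot react to other events while it waits for GPT-4o. Below is a minimal sketch of moving a blocking call onto a worker thread with asyncio.to_thread (Python 3.9 or later). slow_request is a hypothetical stand-in for the OpenAI request and is not part of the original bot.

```python
import asyncio
import time


def slow_request(prompt: str) -> str:
    """Hypothetical stand-in for a blocking API call such as the OpenAI request."""
    time.sleep(0.1)  # simulate network latency
    return f"answer to: {prompt}"


async def handle(prompt: str) -> str:
    # Run the blocking function in a worker thread so the event loop stays free
    return await asyncio.to_thread(slow_request, prompt)


async def demo() -> list:
    # The two requests overlap instead of running back to back
    return await asyncio.gather(handle("Q1"), handle("Q2"))


results = asyncio.run(demo())
```

In main.py the same idea would mean wrapping the chat.completions.create call in a small function and awaiting it through asyncio.to_thread.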

if messages.content == "test": fires when someone posts test on Discord.
The handling for an attached image file happens in if messages.attachments:.
base64_image = encode_image(IMAGE_PATH) converts the image file into a form the OpenAI API can read, and the code right below it sends the request to OpenAI.
pdf = FPDF() creates the PDF, and await messages.channel.send(file=discord.File('./output.pdf', filename='aaa.pdf')) sends the PDF file to Discord.
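
One detail in the request: main.py always labels the image as image/png in the data URL, even when the photo is a JPEG. The sketch below shows one way to build the data URL with a MIME type guessed from the file name. to_data_url is a hypothetical helper, and falling back to image/png is an assumption chosen to match what main.py does today.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Build a base64 data URL whose MIME type matches the file extension."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        mime = "image/png"  # assumption: fall back to the type main.py hard-codes
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# The first three bytes of a JPEG file, used here as a tiny sample payload
url = to_data_url(b"\xff\xd8\xff", "homework.jpg")
```

The resulting string can be passed as the url value of the image_url entry in the messages list.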

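A problem I ran into: PDF conversion does not support Japanese

I had to download the Yu Gothic .ttf file myself and load it into the PDF.
fpdf2 does not support Japanese out of the box, so download a font you like and register it.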
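
To my understanding, the built-in fonts in fpdf2 only cover Latin-1 characters, which is why Japanese text needs a separately loaded TTF font. Under that assumption, a quick check like the following can tell you whether a given reply needs the Japanese font at all. needs_unicode_font is a hypothetical helper and is not part of fpdf2.

```python
def needs_unicode_font(text: str) -> bool:
    """Return True if the text contains characters outside Latin-1."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return True
    return False


japanese = needs_unicode_font("解説")   # Japanese text cannot be encoded as Latin-1
english = needs_unicode_font("answer")  # plain ASCII text can
```

Since the system prompt is in Japanese, the reply will almost always need the TTF font, so registering it unconditionally as main.py does is a reasonable choice.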

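Usage

Clone the repository with git.
Write your OpenAI API key and your Discord bot token in the .env file.
Save it, then run the following commands in the directory that contains docker-compose.yaml.

docker-compose build
docker-compose up

The bot is now ready to use.
To use it, post
test + an attached photo
and the bot will reply.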
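
If a value is missing from .env, main.py only fails later with a less obvious error. The following sketch checks the two variables at startup and fails fast with a clear message. The variable names TOKEN and OpenAIkey come from main.py; load_settings itself is a hypothetical helper.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read the two secrets main.py expects and fail fast if one is missing."""
    names = ("TOKEN", "OpenAIkey")  # variable names used in main.py
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in names}


# Example call with an explicit mapping instead of the real environment
settings = load_settings({"TOKEN": "discord-token", "OpenAIkey": "openai-key"})
```

Calling load_settings() with no argument at the top of main.py would read the real environment that docker-compose passes in.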

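Impressions

As soon as I heard that GPT-4o's image recognition had improved, I wanted to build this.
I made it in two days, so the quality is uneven, but for my first time using GPT it went smoothly and was a lot of fun.
With technology advancing so quickly, I want to work hard to keep up.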
