Drawing a Sequence Diagram to Debug the OpenAI Agents SDK

Last updated at 2025-03-26. Posted at 2025-03-25.

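Introduction

The OpenAI Agents SDK was released on March 11, 2025. In this article I follow the official documentation, run the code myself, and dig into the points I found most important or interesting.

The library version used for verification is openai-agents==0.0.6.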

Introducing the OpenAI Agents SDK

I introduce the SDK itself on a separate page.

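Adding debug prints

You can inspect a run with Traces on the OpenAI API platform, but operating the browser every time you run the pipeline can get tedious. Instead, let's add debug output so the behavior is visible at run time.

AgentHooks and RunHooks

Setting hooks on an Agent and on the Runner lets you hook events at run time.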

src/lifecycle_example.py

import asyncio
import random
from typing import Any

from agents import (
    Agent,
    AgentHooks,
    RunContextWrapper,
    RunHooks,
    Runner,
    Tool,
    Usage,
    function_tool,
)
from pydantic import BaseModel


class CustomAgentHooks(AgentHooks):
    def __init__(self, display_name: str):
        self.event_counter = 0
        self.display_name = display_name

    async def on_start(self, context: RunContextWrapper, agent: Agent) -> None:
        self.event_counter += 1
        print(
            f"### AH {self.display_name} {self.event_counter}: "
            f"Agent {agent.name} started"
        )

    async def on_end(
        self, context: RunContextWrapper, agent: Agent, output: Any
    ) -> None:
        self.event_counter += 1
        print(
            f"### AH {self.display_name} {self.event_counter}: "
            f"Agent {agent.name} ended with output {output}"
        )

    async def on_handoff(
        self, context: RunContextWrapper, agent: Agent, source: Agent
    ) -> None:
        self.event_counter += 1
        print(
            f"### AH {self.display_name} {self.event_counter}: "
            f"Agent {source.name} handed off to {agent.name}"
        )

    async def on_tool_start(
        self, context: RunContextWrapper, agent: Agent, tool: Tool
    ) -> None:
        self.event_counter += 1
        print(
            f"### AH {self.display_name} {self.event_counter}: "
            f"Agent {agent.name} started tool {tool.name}"
        )

    async def on_tool_end(
        self, context: RunContextWrapper, agent: Agent, tool: Tool, result: str
    ) -> None:
        self.event_counter += 1
        print(
            f"### AH {self.display_name} {self.event_counter}: "
            f"Agent {agent.name} ended tool {tool.name} with result {result}"
        )


class CustomRunHooks(RunHooks):
    def __init__(self):
        self.event_counter = 0

    def _usage_to_str(self, usage: Usage) -> str:
        return (
            f"{usage.requests} requests, {usage.input_tokens} input tokens, "
            f"{usage.output_tokens} output tokens, {usage.total_tokens}"
            " total tokens"
        )

    async def on_agent_start(
        self, context: RunContextWrapper, agent: Agent
    ) -> None:
        self.event_counter += 1
        print(
            f"### RH {self.event_counter}: Agent {agent.name} started. "
            f"Usage: {self._usage_to_str(context.usage)}"
        )

    async def on_agent_end(
        self, context: RunContextWrapper, agent: Agent, output: Any
    ) -> None:
        self.event_counter += 1
        print(
            f"### RH {self.event_counter}: Agent {agent.name} ended with "
            f"output {output}. Usage: {self._usage_to_str(context.usage)}"
        )

    async def on_tool_start(
        self, context: RunContextWrapper, agent: Agent, tool: Tool
    ) -> None:
        self.event_counter += 1
        print(
            f"### RH {self.event_counter}: Tool {tool.name} started. Usage: "
            f"{self._usage_to_str(context.usage)}"
        )

    async def on_tool_end(
        self, context: RunContextWrapper, agent: Agent, tool: Tool, result: str
    ) -> None:
        self.event_counter += 1
        print(
            f"### RH {self.event_counter}: Tool {tool.name} ended with "
            f"result {result}. Usage: {self._usage_to_str(context.usage)}"
        )

    async def on_handoff(
        self, context: RunContextWrapper, from_agent: Agent, to_agent: Agent
    ) -> None:
        self.event_counter += 1
        print(
            f"### RH {self.event_counter}: Handoff from {from_agent.name} to "
            f"{to_agent.name}. Usage: {self._usage_to_str(context.usage)}"
        )


@function_tool
def random_number(max: int) -> int:
    """指定された最大値までの乱数を生成します"""
    return random.randint(0, max)


@function_tool
def multiply_by_two(x: int) -> int:
    """x を 2 倍して返す"""
    return x * 2


class FinalResult(BaseModel):
    number: int


multiply_agent = Agent(
    name="乗算エージェント",
    instructions="数値を2倍にして最終結果を返します",
    tools=[multiply_by_two],
    output_type=FinalResult,
    hooks=CustomAgentHooks(display_name="乗算 Agent"),
)

orchestration_agent = Agent(
    name="オーケストレーションエージェント",
    instructions="乱数を生成します。偶数の場合は停止します。奇数の場合は乗算エージェントに渡します。",
    tools=[random_number],
    output_type=FinalResult,
    handoffs=[multiply_agent],
    hooks=CustomAgentHooks(display_name="オーケストレーション Agent"),
)


async def main() -> None:
    user_input = input("最大数を入力してください: ")
    await Runner.run(
        orchestration_agent,
        hooks=CustomRunHooks(),
        input=f"0 から {user_input} までのランダムな整数を生成して。",
    )
    print("Done!")


if __name__ == "__main__":
    asyncio.run(main())

Here is the output of a run.

最大数を入力してください: 10
### RH 1: Agent オーケストレーションエージェント started. Usage: 0 requests, 0 input tokens, 0 output tokens, 0 total tokens
### AH オーケストレーション Agent 1: Agent オーケストレーションエージェント started
### RH 2: Tool random_number started. Usage: 1 requests, 375 input tokens, 15 output tokens, 390 total tokens
### AH オーケストレーション Agent 2: Agent オーケストレーションエージェント started tool random_number
### RH 3: Tool random_number ended with result 3. Usage: 1 requests, 375 input tokens, 15 output tokens, 390 total tokens
### AH オーケストレーション Agent 3: Agent オーケストレーションエージェント ended tool random_number with result 3
### RH 4: Handoff from オーケストレーションエージェント to 乗算エージェント. Usage: 2 requests, 774 input tokens, 28 output tokens, 802 total tokens
### AH オーケストレーション Agent 4: Agent オーケストレーションエージェント handed off to 乗算エージェント
### RH 5: Agent 乗算エージェント started. Usage: 2 requests, 774 input tokens, 28 output tokens, 802 total tokens
### AH 乗算 Agent 1: Agent 乗算エージェント started
### RH 6: Tool multiply_by_two started. Usage: 3 requests, 774 input tokens, 28 output tokens, 802 total tokens
### AH 乗算 Agent 2: Agent 乗算エージェント started tool multiply_by_two
### RH 7: Tool multiply_by_two ended with result 6. Usage: 3 requests, 774 input tokens, 28 output tokens, 802 total tokens
### AH 乗算 Agent 3: Agent 乗算エージェント ended tool multiply_by_two with result 6
### RH 8: Agent 乗算エージェント ended with output number=6. Usage: 4 requests, 1188 input tokens, 39 output tokens, 1227 total tokens
### AH 乗算 Agent 4: Agent 乗算エージェント ended with output number=6
Done!

It looks like the start and end of each action are being hooked.

I turned this into a sequence diagram with Mermaid.
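
As a rough sketch of how a diagram like this can be produced from the hook output, the snippet below turns a hand-transcribed list of events into Mermaid `sequenceDiagram` source. The `Message` tuple, the `to_mermaid` helper, and the short participant names (`Runner`, `Orchestrator`, `Multiplier`) are assumptions made for illustration. They are not part of the SDK, and the result is not necessarily identical to the diagram I drew.

```python
from typing import Iterable, NamedTuple


class Message(NamedTuple):
    source: str  # participant that sends the message
    target: str  # participant that receives it
    label: str   # text shown on the arrow


def to_mermaid(messages: Iterable[Message]) -> str:
    """Render messages as Mermaid sequenceDiagram source text."""
    lines = ["sequenceDiagram"]
    for m in messages:
        lines.append(f"    {m.source}->>{m.target}: {m.label}")
    return "\n".join(lines)


# Events transcribed by hand from the hook output above (names shortened).
events = [
    Message("Runner", "Orchestrator", "on_agent_start / on_start"),
    Message("Orchestrator", "random_number", "on_tool_start"),
    Message("random_number", "Orchestrator", "on_tool_end (result 3)"),
    Message("Orchestrator", "Multiplier", "on_handoff"),
    Message("Runner", "Multiplier", "on_agent_start / on_start"),
    Message("Multiplier", "multiply_by_two", "on_tool_start"),
    Message("multiply_by_two", "Multiplier", "on_tool_end (result 6)"),
    Message("Multiplier", "Runner", "on_agent_end / on_end (number=6)"),
]

diagram = to_mermaid(events)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer. Note that no arrow goes from `Orchestrator` back to `Runner`, which mirrors the missing end event in the log.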

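Debugging a little further behind the scenes

There is also a way to send detailed logs to standard output. Adding the following code prints, among other things, the exchanges with the backend API.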

from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()

The log level can be set with logging from the Python standard library. This may help when you want to see what is happening on the backend side.

src/logging.py

from agents import Agent, Runner
from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()
agent = Agent(
    name="アシスタント", instructions="あなたは役に立つアシスタントです"
)
result = Runner.run_sync(agent, "鉄釘の磁化について俳句を書いて")
print(result.final_output)

$ python src/logging.py
Creating trace Agent workflow with id trace_xxxxxxxxxxxxxxxxxxx
Setting current trace: trace_xxxxxxxxxxxxxxxxxxx
Creating span <agents.tracing.span_data.AgentSpanData object at 0x78ac4aee85f0> with id None
Running agent アシスタント (turn 1)
Creating span <agents.tracing.span_data.ResponseSpanData object at 0x78ac4aeed770> with id None
Calling LLM gpt-4o with input:
[
  {
    "content": "\u9244\u91d8\u306e\u78c1\u5316\u306b\u3064\u3044\u3066\u4ff3\u53e5\u3092\u66f8\u3044\u3066",
    "role": "user"
  }
]
Tools:
[]
Stream: False
Tool choice: NOT_GIVEN
Response format: NOT_GIVEN

LLM resp:
[
  {
    "id": "msg_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "content": [
      {
        "annotations": [],
        "text": "\u5fc3\u306e\u706f  \n\u9244\u91d8\u78c1\u5316\u3057  \n\u8f1d\u304d\u3092",
        "type": "output_text"
      }
    ],
    "role": "assistant",
    "status": "completed",
    "type": "message"
  }
]

Resetting current trace
心の灯  
鉄釘磁化し  
輝きを

Observations

Looking closely at the logs, the following points caught my attention.

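The end of the orchestration agent (オーケストレーションエージェント) never fires
- on_end is defined, but it never shows up in the log for this agent
- RunHooks.on_agent_end does fire, but in the log above only for the multiply agent (RH 8)
- on_end for the other agent, the multiply agent (乗算エージェント), does show up in the log
- Perhaps an agent that hands off is treated as no longer involved, since it never produces the final output itself
Completion of the Runner cannot be hooked
- RunHooks has no on_end in the first place
- Are we supposed to infer that the run is over once the agents, tools, and handoffs have finished?
- Considering asynchronous execution, wouldn't it make sense to have an on_end?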
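
Until a run-level completion hook exists, one possible workaround is to wrap the awaited call yourself. The following is a minimal sketch under that assumption: `run_with_completion_hook`, `on_run_end`, and `fake_run` are hypothetical names, not SDK APIs, and `fake_run` only stands in for a call such as `Runner.run(...)` so that the example runs without the SDK installed.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def run_with_completion_hook(
    run: Awaitable[Any],
    on_run_end: Callable[[Any], Awaitable[None]],
) -> Any:
    """Await `run`, then call `on_run_end` with its result."""
    result = await run
    await on_run_end(result)
    return result


async def fake_run() -> str:
    # Stand-in for `Runner.run(...)`; returns a dummy result.
    await asyncio.sleep(0)
    return "number=6"


completed: list[str] = []


async def on_run_end(result: Any) -> None:
    # Hypothetical completion callback: record that the run finished.
    completed.append(f"run finished with {result}")


final = asyncio.run(run_with_completion_hook(fake_run(), on_run_end))
print(completed[0])
```

The callback runs only after the whole run has returned, so it sees the final result regardless of which agent produced it.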

Writing CustomAgentHooks and CustomRunHooks every time is tedious. Putting them in a utility module and reusing them seems like a good idea.

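This time I simply wrote to standard output with print(). In production you will likely need to send these events to a Logger, MLflow, CloudWatch (AWS), or similar, and it is nice that the hooks make this straightforward.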
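
As one way this could look, here is a minimal sketch of a reusable helper that replaces the print() calls with the standard logging module. `HookEventLogger`, its `record` method, and the logger name `hook_debug` are hypothetical names chosen for illustration. The idea is that each hook method (on_start, on_tool_end, and so on) would delegate to `record` instead of formatting and printing on its own.

```python
import logging


class HookEventLogger:
    """Counts hook events and forwards them to a standard logging.Logger."""

    def __init__(self, prefix: str, logger: logging.Logger) -> None:
        self.prefix = prefix
        self.logger = logger
        self.event_counter = 0

    def record(self, message: str) -> str:
        # Same "### <prefix> <n>: <message>" layout as the print() version.
        self.event_counter += 1
        line = f"### {self.prefix} {self.event_counter}: {message}"
        self.logger.info(line)
        return line


logging.basicConfig(level=logging.INFO, format="%(message)s")
recorder = HookEventLogger("AH demo", logging.getLogger("hook_debug"))

# Inside a hook you would call, for example,
# recorder.record(f"Agent {agent.name} started").
first = recorder.record("Agent demo started")
second = recorder.record("Agent demo ended")
```

Because the output goes through a Logger, redirecting it to a file or an external service becomes a matter of attaching a different handler.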

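As for the verbose logging, logging.DEBUG feels excessive while logging.INFO is nearly silent. It probably won't see much use when developing an application on an LLM that behaves normally, but it may help when working with a backend LLM that is hard to debug.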
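
For reference, a minimal sketch of setting the level by hand with the standard logging module instead of calling enable_verbose_stdout_logging(). The logger names `openai.agents` and `openai.agents.tracing` are the ones I understand the SDK documentation to use. Treat them as an assumption and check them against your installed version.

```python
import logging
import sys

# Assumed logger name for the SDK's main logger.
agents_logger = logging.getLogger("openai.agents")
agents_logger.setLevel(logging.INFO)  # switch to logging.DEBUG for full detail
agents_logger.addHandler(logging.StreamHandler(sys.stdout))

# Assumed logger name for tracing; keep it quieter than the main logger.
tracing_logger = logging.getLogger("openai.agents.tracing")
tracing_logger.setLevel(logging.WARNING)
```

Configuring the two loggers separately makes it possible to keep tracing messages quiet while still watching the agent loop.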

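Summary

So, how was it?

I looked into ways of debugging the OpenAI Agents SDK (v0.0.6), and I'll stop here for now.
It turns out that AgentHooks and RunHooks let you eavesdrop on the agents' secret conversations (their callbacks). That makes it possible to see what the agents are actually doing, which should make development go much more smoothly.

That said, plenty of mysteries remain: the first agent quietly leaves the stage (its on_end is never called...), and the Runner finishes without telling anyone (no completion hook is defined...). I hope future updates tidy this up. Then again, I suppose I could just fix it and send a pull request...

Now that the behavior of the OpenAI Agents SDK is becoming visible, I plan to keep digging.

I hope this article helps with your own AI agent debugging.

Thank you for reading to the end!!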

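References

Official sample code collection

Browsing through it once should teach you a lot.

Official documentation

Reasonably thorough.

Official documentation: Lifecycle

Describes the AgentHooks and RunHooks covered in this article, fairly briefly.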
