AIエージェントを自作する①

AIエージェント

Posted at 2025-12-27

はじめに

AIエージェントを勉強するために簡単なものを自作していきます。

まずはチャットの様子を御覧ください。

uv run python3 main.py
AI Agent CLI (type 'exit' to quit)
you> こんにちは
ai> こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。

{"turn": 1, "timestamp": 1766789150.2206059, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "こんにちは"}, {"role": "assistant", "content": "こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。"}]}
you> 123*234=?
ai> 123×234 = 28,782
（123×200 = 24,600；123×34 = 4,182）

{"turn": 2, "timestamp": 1766789167.461524, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "こんにちは"}, {"role": "assistant", "content": "こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。"}, {"role": "user", "content": "123*234=?"}, {"role": "assistant", "content": "123×234 = 28,782  \n（123×200 = 24,600；123×34 = 4,182）"}]}
you> 1234*3456=?
ai> 1234×3456 = 4,264,704

{"turn": 3, "timestamp": 1766789209.749444, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "こんにちは"}, {"role": "assistant", "content": "こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。"}, {"role": "user", "content": "123*234=?"}, {"role": "assistant", "content": "123×234 = 28,782  \n（123×200 = 24,600；123×34 = 4,182）"}, {"role": "user", "content": "1234*3456=?"}, {"role": "assistant", "content": "", "tool_calls": [{"id": "call_i9S7Fd44GbVblqeXL9sMrqVV", "type": "function", "function": {"name": "calculator", "arguments": "{\"expression\":\"1234*3456\"}"}}]}, {"role": "tool", "content": "4264704", "tool_call_id": "call_i9S7Fd44GbVblqeXL9sMrqVV", "name": "calculator"}, {"role": "assistant", "content": "1234×3456 = 4,264,704"}]}

以下のような仕様です。

ローカルでPythonスクリプトとして実行
you, aiのようにどちらのターンなのかを表示
学習用にjson形式でログを標準出力
- turn, timestamp, mode, messages
function calling対応
- 電卓
- 時刻取得
LangChain, LangGraphなどのライブラリは使わない
- 学習が目的のため、できるだけLLMを生に近い形で触りたいため
モデルはOpenRouter経由でGPT-5 miniを利用

工夫・面白い点

messagesの積み上げ

LLMはステートレスなので相手の直前までのやり取りが文脈として扱うことでマルチターンの会話を実現する必要があります。ログを見るとわかるのですがmessagesがターンごとに積み上がっていることがわかります。

{"turn": 1, "timestamp": 1766789150.2206059, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "こんにちは"}, {"role": "assistant", "content": "こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。"}]}

{"turn": 2, "timestamp": 1766789167.461524, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "こんにちは"}, {"role": "assistant", "content": "こんにちは！ご用件は何でしょうか？質問や相談、何でもどうぞ。"}, {"role": "user", "content": "123*234=?"}, {"role": "assistant", "content": "123×234 = 28,782  \n（123×200 = 24,600；123×34 = 4,182）"}]}

function calling

LLMは言ってしまえば言語モデルなので、特定の決定的なタスク（たとえば四則演算）を解かせることは向いていません（ある程度は解けるものの時々間違えてしまったりしますし、計算効率がよくない）。そこでfunction callingという考え方があり、そういった特定のタスクは関数として実装してLLMにはそれを呼び出しさせるようにします。

電卓

turn: 3において、userからの入力に対して

{"role": "user", "content": "1234*3456=?"}

assistant(LLM)は自分で計算せずにfunction callしています

{"role": "assistant", "content": "", "tool_calls": [{"id": "call_i9S7Fd44GbVblqeXL9sMrqVV", "type": "function", "function": {"name": "calculator", "arguments": "{\"expression\":\"1234*3456\"}"}}]}

そしてtoolから返答があり

{"role": "tool", "content": "4264704", "tool_call_id": "call_i9S7Fd44GbVblqeXL9sMrqVV", "name": "calculator"}

答えを作文してuserに返しています。

{"role": "assistant", "content": "1234×3456 = 4,264,704"}]}

ちなみにturn: 2においては、function callingをせずに自前で計算したようです。

時刻取得

こちらも面白かった点が一つあります。今回用意した関数get_current_time()は、タイムゾーンを引数にとることができずUTCを返します。LLMはこの関数を実行したあと、日本時間へ変換してくれています。プロンプトの通りuserは「今の時刻は？」と聞いただけですが、日本語で聞いたので日本時間に変換してくれたのでしょう。

ログを取りそこねましたが、引数にタイムゾーンを渡そうとしてエラーになっているケースもありました。この辺りは改善の余地がありそうです。

you> 今の時刻は？
ai> 現在の時刻は協定世界時(UTC)で 2025-12-26 23:14:20 です。
日本時間(JST, UTC+9)では 2025-12-27 08:14:20 です。別のタイムゾーンで知りたいですか？

{"turn": 1, "timestamp": 1766790867.4332318, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "今の時刻は？"}, {"role": "assistant", "content": "", "tool_calls": [{"id": "call_dAz3dNI99jVQl9PBdvhRhOTv", "type": "function", "function": {"name": "get_current_time", "arguments": "{}"}}]}, {"role": "tool", "content": "2025-12-26T23:14:20.459735+00:00", "tool_call_id": "call_dAz3dNI99jVQl9PBdvhRhOTv", "name": "get_current_time"}, {"role": "assistant", "content": "現在の時刻は協定世界時(UTC)で 2025-12-26 23:14:20 です。  \n日本時間(JST, UTC+9)では 2025-12-27 08:14:20 です。別のタイムゾーンで知りたいですか？"}]}

you> what time is it now?
ai> It's 2025-12-26 23:18:43 UTC. Want this converted to your local time or a 12-hour format?

{"turn": 1, "timestamp": 1766791126.690485, "model": "openai/gpt-5-mini", "messages": [{"role": "system", "content": "You are a concise and helpful assistant. Think step-by-step when needed."}, {"role": "user", "content": "what time is it now?"}, {"role": "assistant", "content": "", "tool_calls": [{"id": "call_K70FwWrNiCxqWIIPnVmgv9fO", "type": "function", "function": {"name": "get_current_time", "arguments": "{}"}}]}, {"role": "tool", "content": "2025-12-26T23:18:43.629840+00:00", "tool_call_id": "call_K70FwWrNiCxqWIIPnVmgv9fO", "name": "get_current_time"}, {"role": "assistant", "content": "It's 2025-12-26 23:18:43 UTC. Want this converted to your local time or a 12-hour format?"}]}

toolの一覧をLLMはどう知るのか？

さらにログを追加してtoolsをどのようにAPIにリクエストしているかを確認できます。本来はこれがLLMに生のテキストとしてどうわたるのかを確認したいですが、APIを叩いている都合上そこまではわかりません。

{
  "model": "openai/gpt-5-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise and helpful assistant. Think step-by-step when needed."
    },
    {
      "role": "user",
      "content": "こんにちは"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "calculator",
        "description": "Evaluate a basic arithmetic expression (supports +, -, *, /, %, **, parentheses).",
        "parameters": {
          "type": "object",
          "properties": {
            "expression": {
              "type": "string",
              "description": "Arithmetic expression to evaluate"
            }
          },
          "required": [
            "expression"
          ]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_current_time",
        "description": "Return the current time in ISO 8601 (UTC).",
        "parameters": {
          "type": "object",
          "properties": {}
        }
      }
    }
  ],
  "temperature": 0.4
}

コード

import ast
import json
import operator
import os
import sys
import time
import logging
from typing import Any, Dict
from datetime import datetime, timezone

from openai import OpenAI


SYSTEM_PROMPT = "You are a concise and helpful assistant. Think step-by-step when needed."
OPENROUTER_API_BASE = "https://openrouter.ai/api/v1"
DEFAULT_MODEL = "openai/gpt-5-mini"

DEBUG_DUMP_PAYLOAD = os.getenv("DEBUG_DUMP_PAYLOAD", "").lower() in {
    "1", "true", "yes"}
DEBUG_HTTP = os.getenv("DEBUG_HTTP", "").lower() in {"1", "true", "yes"}

OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}


def get_client() -> OpenAI:
    api_key = os.getenv("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENROUTER_API_KEY environment variable.")
    return OpenAI(api_key=api_key, base_url=OPENROUTER_API_BASE)


def maybe_enable_http_debug() -> None:
    # The OpenAI Python SDK uses httpx under the hood.
    # This enables fairly verbose request/response logging (headers + URL).
    if not DEBUG_HTTP:
        return
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger("httpx").setLevel(logging.DEBUG)


def dump_payload(label: str, payload: Dict[str, Any]) -> None:
    if not DEBUG_DUMP_PAYLOAD:
        return
    print(f"\n--- {label} ---")
    print(json.dumps(payload, ensure_ascii=False, indent=2))
    print("--- end ---\n")


def eval_expr(expr: str) -> float:
    node = ast.parse(expr, mode="eval")

    def _eval(n):
        if isinstance(n, ast.Expression):
            return _eval(n.body)
        if isinstance(n, ast.Num):  # type: ignore[attr-defined]
            return n.n
        if isinstance(n, ast.UnaryOp) and type(n.op) in OPS:
            return OPS[type(n.op)](_eval(n.operand))
        if isinstance(n, ast.BinOp) and type(n.op) in OPS:
            return OPS[type(n.op)](_eval(n.left), _eval(n.right))
        raise ValueError("Unsupported expression")

    return _eval(node)


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate a basic arithmetic expression (supports +, -, *, /, %, **, parentheses).",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Arithmetic expression to evaluate",
                    }
                },
                "required": ["expression"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Return the current time in ISO 8601 (UTC).",
            "parameters": {
                "type": "object",
                "properties": {},
            },
        },
    },
]


def chat_loop() -> None:
    client = get_client()
    maybe_enable_http_debug()
    turn = 0
    history = [
        {
            "role": "system",
            "content": SYSTEM_PROMPT,
        }
    ]

    print("AI Agent CLI (type 'exit' to quit)")
    while True:
        try:
            user_input = input("you> ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nbye")
            return

        if not user_input:
            continue
        if user_input.lower() in {"exit", "quit"}:
            print("bye")
            return

        turn += 1
        history.append({"role": "user", "content": user_input})
        messages_payload = list(history)

        try:
            req = {
                "model": DEFAULT_MODEL,
                "messages": messages_payload,
                "tools": TOOLS,
                "temperature": 0.4,
            }
            dump_payload("REQUEST (before first call)", req)
            response = client.chat.completions.create(**req)
        except Exception as exc:  # pragma: no cover - runtime safety
            print(f"error: {exc}")
            history.pop()  # drop failed turn
            continue

        choice = response.choices[0]
        message = choice.message

        if message.tool_calls:
            history.append(
                {
                    "role": "assistant",
                    "content": message.content,
                    "tool_calls": [
                        {
                            "id": tc.id,
                            "type": tc.type,
                            "function": {
                                "name": tc.function.name,
                                "arguments": tc.function.arguments,
                            },
                        }
                        for tc in message.tool_calls
                    ],
                }
            )
            for tool_call in message.tool_calls:
                name = tool_call.function.name
                try:
                    args = json.loads(tool_call.function.arguments or "{}")
                except json.JSONDecodeError:
                    args = {}

                if name == "calculator":
                    try:
                        result = eval_expr(args["expression"])
                        tool_content = str(result)
                    except Exception as exc:  # pragma: no cover - runtime safety
                        tool_content = f"error: {exc}"
                elif name == "get_current_time":
                    now = datetime.now(timezone.utc).isoformat()
                    tool_content = now
                else:
                    tool_content = f"error: unsupported tool {name}"

                history.append(
                    {
                        "role": "tool",
                        "content": tool_content,
                        "tool_call_id": tool_call.id,
                        "name": name,
                    }
                )

            messages_payload = list(history)
            try:
                req = {
                    "model": DEFAULT_MODEL,
                    "messages": messages_payload,
                    "tools": TOOLS,
                    "temperature": 0.4,
                }
                dump_payload("REQUEST (after tool results)", req)
                response = client.chat.completions.create(**req)
            except Exception as exc:  # pragma: no cover - runtime safety
                print(f"error: {exc}")
                history.pop()  # drop failed turn
                continue
            choice = response.choices[0]
            message = choice.message

        final_content = message.content or ""
        history.append({"role": "assistant", "content": final_content})
        print(f"ai> {final_content}\n")

        messages_payload = list(history)
        log_entry = {
            "turn": turn,
            "timestamp": time.time(),
            "model": DEFAULT_MODEL,
            "messages": messages_payload,
        }
        print(json.dumps(log_entry, ensure_ascii=False, indent=2))


def main() -> None:
    try:
        chat_loop()
    except Exception as exc:  # pragma: no cover - runtime safety
        print(f"fatal: {exc}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up