[Summary and Translation] Zero to One: An Introduction to Agentic Patterns, for People About to Build AI Products

Posted at 2026-06-07

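Preface

When I started studying AI product development, someone recommended an article to read before starting to build anything.

Article: Zero to One: Learning Agentic Patterns

What follows is my own translation and summary of that article.

Who is this for?

People who want to build AI products.
I think the article is especially useful if you are not sure what architecture to choose.

The article was published in May 2025, so it is about a year old, but because it deals with concepts, I expect it to stay useful for a good while.

It organizes the representative patterns for building AI agents, and this post summarizes them.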

TL;DR

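Systems that follow predetermined steps are called workflows; systems where an LLM makes decisions and the execution changes dynamically are called agents
If a workflow is enough, choose the workflow over an agent, because it is cheaper and faster

The seven patterns

Workflow patterns (structured, fixed steps)
- Prompt Chaining: the order of steps is always fixed
- Routing: the available tasks are defined in advance, and an LLM decides which one to run
- Parallelization: multiple tasks are run at the same time by multiple LLM calls
Agentic patterns (dynamic, autonomous)
- Reflection: the output is refined by running in a loop
- Tool Use: external tools let the LLM act on or fetch things beyond its training data
- Planning (Orchestrator-Workers): an LLM creates a plan (a list of tasks), multiple LLMs execute the tasks, and a final LLM synthesizes the output
- Multi-Agent: LLMs with different roles interact with each other in both directions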

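Choosing a pattern: agents vs. workflows

The article introduces seven patterns for AI-based systems.
These fall into two kinds: agents and workflows.

Agent: the AI decides the approach. Dynamic and variable. Needed when the AI has to make decisions. Uses external tools and memory. Often costs more time and money
Workflow: follows a predetermined execution path. The steps are fixed. Suited to cases that require consistency. Simpler and cheaper than an agent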

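When building a system, treat an agent as the last resort and choose in this order:

First, can the problem be solved with a script (which needs no AI at all)?
If it needs AI but the steps are clear, can it be solved with a workflow?
Only problems that need complex, dynamic decision-making that a workflow cannot handle call for an agent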
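
The selection order above can be written down as a tiny decision function. This is only a sketch: the name `choose_approach` and its two boolean inputs are assumptions made for illustration and do not come from the article.

```python
from enum import Enum


class Approach(Enum):
    SCRIPT = "script"
    WORKFLOW = "workflow"
    AGENT = "agent"


def choose_approach(needs_ai: bool, steps_are_fixed: bool) -> Approach:
    """Pick the simplest option that can solve the problem."""
    if not needs_ai:
        return Approach.SCRIPT      # no AI needed: a plain script is enough
    if steps_are_fixed:
        return Approach.WORKFLOW    # AI needed, but the path is known in advance
    return Approach.AGENT           # dynamic decision-making is unavoidable
```

The point of the ordering is that each later option is only reached when the earlier, cheaper one has been ruled out.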

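If you do choose an agent, it can produce unexpected errors, because the AI is the one making decisions.
Logging, exception handling, and retries are therefore essential, along with a mechanism that lets the system recover from errors automatically.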
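
As one possible shape for this, here is a minimal sketch of a retry wrapper with logging. The helper name `call_with_retry`, the exponential backoff, and the `flaky_step` stand-in are all assumptions for illustration; the article does not prescribe a specific implementation.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("agent")


def call_with_retry(step: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.5) -> T:
    """Run `step`, logging each failure and retrying with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical step that fails twice before succeeding (stands in for an LLM or tool call).
attempts = []


def flaky_step() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = call_with_retry(flaky_step, max_attempts=3, base_delay=0.0)
```

In a real agent, `step` would wrap the model or tool call, and the logged warnings give a trail for debugging unexpected behavior.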

Also, in real AI product development, it is important to define evaluations ("Evaluation") and to adapt the design flexibly based on what they show.
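
To give a rough idea of what defining an evaluation can mean in code, here is a minimal sketch that scores a system against a small set of checks. The `pass_rate` helper, the case format, and the `stub_system` stand-in are hypothetical; a real evaluation set would come from your product's requirements.

```python
from typing import Callable

# Each case pairs an input with a check on the system's output (an assumed format).
EvalCase = tuple[str, Callable[[str], bool]]


def pass_rate(system: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose check accepts the system's output."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = sum(1 for prompt, check in cases if check(system(prompt)))
    return passed / len(cases)


def stub_system(prompt: str) -> str:
    """Stand-in for the AI system under test: simply upper-cases the input."""
    return prompt.upper()


cases: list[EvalCase] = [
    ("hello", lambda out: out == "HELLO"),
    ("bonjour", lambda out: "BON" in out),
    ("hi", lambda out: len(out) > 5),  # deliberately fails for the stub
]
score = pass_rate(stub_system, cases)
```

Re-running the same cases after each design change makes it easier to tell whether a switch between patterns actually helped.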

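How to use the patterns

Seven workflow and agent patterns are listed below, but in practice they are combined more often than they are used alone.

For example:

A Planning agent uses tools internally
Its workers use Reflection
A Multi-Agent system uses Routing internally to assign tasks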

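Workflow patterns

1. Prompt Chaining

Source: Zero to One: Learning Agentic Patterns

A pattern in which the output of one LLM call is passed, in order, as the input to the next LLM call.
The task is broken down into a fixed sequence of steps.

It suits processing that can be split into stages.

Use cases

Document creation
- Example: LLM 1: create a draft -> LLM 2: validate it -> LLM 3: write the body
Multi-stage data processing
- Example: LLM 1: extract data -> LLM 2: transform it -> LLM 3: summarize it
Newsletter generation

Example

A system that summarizes the input text and then translates the summary into French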

import os
from google import genai
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
# Step 1: Summarize the input text
original_text = "Large language models are powerful AI systems trained on vast amounts of text data. They can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way."
prompt1 = f"Summarize the following text in one sentence: {original_text}"
 
response1 = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=prompt1
)
summary = response1.text.strip()
print(f"Summary: {summary}")
 
# Step 2: Translate the summary into French
prompt2 = f"Translate the following summary into French, only return the translation, no other text: {summary}"

response2 = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=prompt2
)
translation = response2.text.strip()
print(f"Translation: {translation}")

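2. Routing / Handoff

Source: Zero to One: Learning Agentic Patterns

A pattern in which the first LLM acts as a router and classifies the input.
The router LLM dispatches the input to the most suitable specialized task or LLM.

Overall flow

Define the routes (the categories to dispatch to) in advance
The first LLM decides which route the input belongs to
The task or LLM assigned to that route is called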
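
The flow above boils down to "classify, then look up a handler". Here is a minimal offline sketch of that shape. The handler names, the `ROUTES` table, and the keyword-based `classify` function are assumptions for illustration; in a real system the classification would be done by the router LLM.

```python
from typing import Callable


# Hypothetical handlers; each would use its own prompt or model in a real system.
def answer_weather(query: str) -> str:
    return f"[weather handler] {query}"


def answer_science(query: str) -> str:
    return f"[science handler] {query}"


def answer_fallback(query: str) -> str:
    return f"[fallback handler] {query}"


# Routes are fixed in advance; only the choice between them is dynamic.
ROUTES: dict[str, Callable[[str], str]] = {
    "weather": answer_weather,
    "science": answer_science,
}


def classify(query: str) -> str:
    """Stand-in for the router LLM: a keyword check that keeps the sketch offline."""
    lowered = query.lower()
    if "weather" in lowered:
        return "weather"
    if "quantum" in lowered or "physics" in lowered:
        return "science"
    return "unknown"


def route(query: str) -> str:
    """Dispatch the query to the handler for its category, or to the fallback."""
    handler = ROUTES.get(classify(query), answer_fallback)
    return handler(query)
```

Keeping the routes in a table makes it cheap to add a category or to swap the model behind one handler without touching the others.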

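Features

Each stage gets a separation of concerns
Each downstream task can be optimized individually, with its own prompt, model, or tools
Optimizing individually can reduce cost

Use cases

Customer support: route inquiries to a billing, technical support, or product information agent depending on their content
Tiered LLM usage: send simple or frequent questions to a cheap model and complex or unusual ones to a more capable model
Content generation: route to prompts or models for blogs, social media, or ads

Example

A system that classifies the user's question as weather, science, or other, and then answers it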

import os
import json
from google import genai
from pydantic import BaseModel
import enum
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
# Routing: define three categories
class Category(enum.Enum):
    WEATHER = "weather"
    SCIENCE = "science"
    UNKNOWN = "unknown"
 
class RoutingDecision(BaseModel):
    category: Category
    reasoning: str
 
# Step 1: Decide the category from the input
user_query = "What's the weather like in Paris?"
# user_query = "Explain quantum physics simply."
# user_query = "What is the capital of France?"
 
prompt_router = f"""
Analyze the user query below and determine its category.
Categories:
- weather: For questions about weather conditions.
- science: For questions about science.
- unknown: If the category is unclear.
 
Query: {user_query}
"""
 
# Router LLM that decides the category
response_router = client.models.generate_content(
    model= 'gemini-2.0-flash-lite',
    contents=prompt_router,
    config={
        'response_mime_type': 'application/json',
        'response_schema': RoutingDecision,
    },
)
print(f"Routing Decision: Category={response_router.parsed.category}, Reasoning={response_router.parsed.reasoning}")
 
# Step 2: Call a different prompt and model depending on the category chosen in Step 1
final_response = ""
if response_router.parsed.category == Category.WEATHER:
    weather_prompt = f"Provide a brief weather forecast for the location mentioned in: '{user_query}'"
    weather_response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=weather_prompt
    )
    final_response = weather_response.text
elif response_router.parsed.category == Category.SCIENCE:
    science_response = client.models.generate_content(
        model="gemini-2.5-flash-preview-04-17",
        contents=user_query
    )
    final_response = science_response.text
else:
    unknown_response = client.models.generate_content(
        model="gemini-2.0-flash-lite",
        contents=f"The user query is: {user_query}, but could not be answered. Here is the reasoning: {response_router.parsed.reasoning}. Write a helpful response to the user for him to try again."
    )
    final_response = unknown_response.text
print(f"\nFinal Response: {final_response}")

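3. Parallelization

Source: Zero to One: Learning Agentic Patterns

The task is split into independent subtasks that multiple LLMs process concurrently, and an aggregator LLM combines the results into the final output.

Overall flow

Send different prompts to multiple worker LLMs
A final aggregator LLM combines the responses from the worker LLMs

Features

Processing concurrently can reduce latency
Each subtask gets a separation of concerns
Gathering opinions from different perspectives can help improve quality
Two kinds of LLM are used: the LLMs that execute and the LLM that aggregates (the aggregator LLM)

Use cases

RAG with query decomposition: convert a complex query into several sub-queries, run them in parallel, then combine the outputs
- Example: "Compare the camera performance, battery life, and price of the iPhone 15 and Pixel 8" -> look up each phone's camera performance, battery life, and price separately -> summarize the information into a comparison table
Map-reduce style processing
- Summarizing a long document in parts: split the document into chapters -> summarize each chapter -> combine all the chapter summaries
Answers from multiple perspectives: ask multiple LLMs the same question with prompts that give each a different premise or position -> combine the answers
- LLM A: "You are a CFO focused on cutting costs"
- LLM B: "You are an engineer on the front line"
- LLM C: "You are an HR manager"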
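
The map-reduce use case above can be sketched without any API calls. In this minimal example, `summarize` is a stand-in for an async LLM call (it just keeps the first sentence), and the reduce step joins strings where a real system would call an aggregator LLM. Both functions and the sample chapters are assumptions for illustration.

```python
import asyncio


async def summarize(text: str) -> str:
    """Stand-in for an async LLM call: keeps only the first sentence."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return text.split(".")[0].strip() + "."


async def map_reduce_summary(chapters: list[str]) -> str:
    # Map: summarize every chapter concurrently; gather preserves input order.
    partials = await asyncio.gather(*(summarize(chapter) for chapter in chapters))
    # Reduce: combine the partial summaries (an aggregator LLM would go here).
    return " ".join(partials)


chapters = [
    "Robots entered the jungle. They were cautious.",
    "They met a parrot. It talked a lot.",
]
combined = asyncio.run(map_reduce_summary(chapters))
```

Because the chapter summaries do not depend on each other, total latency is closer to the slowest single call than to the sum of all calls.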

Example

A system that takes a topic from the user and develops content for it from several angles (for a blog, for example)

import os
import asyncio
import time
from google import genai
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
async def generate_content(prompt: str) -> str:
    response = await client.aio.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )
    return response.text.strip()
 
async def parallel_tasks():
    # Define Parallel Tasks
    topic = "a friendly robot exploring a jungle"
    # Prepare prompts that approach the story from different angles
    prompts = [
        f"Write a short, adventurous story idea about {topic}.",
        f"Write a short, funny story idea about {topic}.",
        f"Write a short, mysterious story idea about {topic}."
    ]
    # Step 1: Run all the prompts in parallel
    start_time = time.time()
    tasks = [generate_content(prompt) for prompt in prompts]
    results = await asyncio.gather(*tasks)
    end_time = time.time()
    print(f"Time taken: {end_time - start_time} seconds")
 
    print("\n--- Individual Results ---")
    for i, result in enumerate(results):
        print(f"Result {i+1}: {result}\n")
 
    # Step 2: Gather all the responses and merge them into a single idea
    story_ideas = '\n'.join([f"Idea {i+1}: {result}" for i, result in enumerate(results)])
    aggregation_prompt = f"Combine the following three story ideas into a single, cohesive summary paragraph:\n{story_ideas}"
    aggregation_response = await client.aio.models.generate_content(
        model="gemini-2.5-flash-preview-04-17",
        contents=aggregation_prompt
    )
    return aggregation_response.text
    
 
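# asyncio.run lets this run as a plain script; inside a notebook, use `await parallel_tasks()` instead
result = asyncio.run(parallel_tasks())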
print(f"\n--- Aggregated Summary ---\n{result}")

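Agentic patterns

4. Reflection (Evaluator-Optimizer)

Source: Zero to One: Learning Agentic Patterns

The agent evaluates its own output and uses that evaluation to improve it iteratively.
This is also called "evaluator-optimizer" or a "self-correction loop".
Generate -> evaluate (critique) -> improve, repeated until the requirements are met or an iteration limit is reached.

Features

Suited to tasks that demand accuracy
The evaluation criteria need to be decided

Overall flow

LLM 1: generates an output, or completes a task
LLM 2 (or LLM 1 with a different prompt): acts as the evaluator and judges the first output, for example whether it meets the requirements and whether the quality is high
LLM 1: receives that evaluation as feedback and improves the output
These steps repeat until LLM 2 approves or the iteration limit is reached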
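
The loop described above can be separated from any particular model. Below is a minimal sketch with pluggable generator and evaluator functions; `reflect`, `Verdict`, and the two stub functions are hypothetical names introduced for illustration, and the stubs replace what would be LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    passed: bool
    feedback: str


def reflect(
    generate: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Verdict],
    max_iterations: int = 3,
) -> tuple[str, int]:
    """Evaluate a draft and regenerate it with feedback until it passes or the limit is hit."""
    draft = generate(None)
    for iteration in range(1, max_iterations + 1):
        verdict = evaluate(draft)
        if verdict.passed:
            return draft, iteration
        if iteration < max_iterations:
            draft = generate(verdict.feedback)
    return draft, max_iterations  # limit reached: return the last evaluated draft


def stub_generate(feedback: Optional[str]) -> str:
    """Stand-in generator: writes two lines first, four once it has feedback."""
    if feedback is None:
        return "line one\nline two"
    return "line one\nline two\nline three\nline four"


def stub_evaluate(draft: str) -> Verdict:
    """Stand-in evaluator: the only criterion is having exactly four lines."""
    count = len(draft.splitlines())
    return Verdict(count == 4, "" if count == 4 else f"Use exactly four lines, not {count}.")


final_draft, iterations_used = reflect(stub_generate, stub_evaluate)
```

The iteration cap matters in practice: without it, a strict evaluator and a weak generator could loop indefinitely and keep spending tokens.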

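Use cases

Code generation: write code -> run it -> receive error messages or test results -> fix it
Polishing text: write a draft -> evaluate clarity and tone -> revise
Evaluating a plan's feasibility: make a plan -> evaluate its feasibility -> revise
Checking search coverage: search -> check whether all the required information is there -> search further

Example

A system that writes a poem on a topic specified by the user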

import os
import json
from google import genai
from pydantic import BaseModel
import enum

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Evaluation result
class EvaluationStatus(enum.Enum):
    PASS = "PASS"
    FAIL = "FAIL"

class Evaluation(BaseModel):
    evaluation: EvaluationStatus
    feedback: str
    reasoning: str
 
# LLM that writes the poem
def generate_poem(topic: str, feedback: str = None) -> str:
    prompt = f"Write a short, four-line poem about {topic}."
    if feedback:
        prompt += f"\nIncorporate this feedback: {feedback}"
    
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt
    )
    poem = response.text.strip()
    print(f"Generated Poem:\n{poem}")
    return poem
 
# LLM that evaluates the poem
def evaluate(poem: str) -> Evaluation:
    print("\n--- Evaluating Poem ---")
    # Check whether it rhymes well, has exactly four lines, and is creative
    prompt_critique = f"""Critique the following poem. Does it rhyme well? Is it exactly four lines? 
Is it creative? Respond with PASS or FAIL and provide feedback.
 
Poem:
{poem}
"""
    response_critique = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt_critique,
        config={
            'response_mime_type': 'application/json',
            'response_schema': Evaluation,
        },
    )
    critique = response_critique.parsed
    print(f"Evaluation Status: {critique.evaluation}")
    print(f"Evaluation Feedback: {critique.feedback}")
    return critique
 
# Loop
max_iterations = 3
current_iteration = 0
topic = "a robot learning to paint"

# Step 1: The initial poem (a hard-coded two-line draft, so it should fail the four-line check)
current_poem = "With circuits humming, cold and bright,\nA metal hand now holds a brush"
 
while current_iteration < max_iterations:
    current_iteration += 1
    print(f"\n--- Iteration {current_iteration} ---")
    # Step 2-1: Evaluate
    evaluation_result = evaluate(current_poem)
 
    if evaluation_result.evaluation == EvaluationStatus.PASS:
        print("\nFinal Poem:")
        print(current_poem)
        break
    else:
        # Step 2-2: Revise the poem based on the feedback
        current_poem = generate_poem(topic, feedback=evaluation_result.feedback)
        if current_iteration == max_iterations:
            print("\nMax iterations reached. Last attempt:")
            print(current_poem)

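5. Tool Use (Function Calling)

Source: Zero to One: Learning Agentic Patterns

The LLM's ability to call external tools is used to take actions or gather information.

Features

The most common agentic pattern
Lets the LLM do more than what it learned in training

Overall flow

Define in advance the tool definitions given to the LLM (name, description, input schema)
Based on the input, the LLM decides which tool or tools to use and returns arguments that follow the tool's schema
The application runs the tool with those arguments and returns the result to the LLM
The LLM uses the result to respond to the user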
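
The "run the tool" step in the flow above is plain application code, so it can be sketched without a model. In this minimal example, the `TOOLS` registry, the `execute_tool_call` helper, and the JSON shape of `model_output` are assumptions for illustration; real SDKs return tool calls in their own formats.

```python
import json
from typing import Any, Callable


def get_current_temperature(location: str) -> dict:
    """Mock tool; a real one would call a weather API."""
    return {"location": location, "temperature": "15", "unit": "Celsius"}


# Registry mapping tool names (as declared to the LLM) to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {
    "get_current_temperature": get_current_temperature,
}


def execute_tool_call(name: str, args: dict) -> dict:
    """Run the requested tool, returning an error payload instead of raising."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": tool(**args)}
    except TypeError as exc:  # typically wrong or missing arguments
        return {"error": str(exc)}


# Hypothetical tool call, shaped like something a model might return as JSON.
model_output = json.loads('{"name": "get_current_temperature", "args": {"location": "London"}}')
outcome = execute_tool_call(model_output["name"], model_output["args"])
```

Returning an error payload rather than raising lets the error be sent back to the LLM, which can then correct its arguments or pick another tool.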

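Use cases

Booking a schedule through a calendar API
Fetching real-time stock prices through a finance API
RAG: searching a vector database of documents
Smart home control
Code execution

Example

A system that looks up the current temperature in a given city and answers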

import os
from google import genai
from google.genai import types
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
# Define the tool
weather_function = {
    "name": "get_current_temperature",
    "description": "Gets the current temperature for a given location.",
    # Schema definition
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city name, e.g. San Francisco",
            },
        },
        "required": ["location"],
    },
}
 
# Mock of the API call
def get_current_temperature(location: str) -> dict:
    return {"temperature": "15", "unit": "Celsius"}
 
# Register the tool with the LLM and send the prompt
tools = types.Tool(function_declarations=[weather_function])
contents = ["What's the temperature in London right now?"]
response = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=contents,
    config = types.GenerateContentConfig(tools=[tools])
)
 
# Process the Response (Check for Function Call)
response_part = response.candidates[0].content.parts[0]
if response_part.function_call:
    # The LLM requested a tool call
    function_call = response_part.function_call
    print(f"Function to call: {function_call.name}")
    print(f"Arguments: {dict(function_call.args)}")
 
    if function_call.name == "get_current_temperature":   
        # Run the tool (args is a mapping, so unpack it as keyword arguments)
        api_result = get_current_temperature(**function_call.args)
        # Pass the tool result back as part of the prompt
        follow_up_contents = [
            types.Part(function_call=function_call),
            types.Part.from_function_response(
                name="get_current_temperature",
                response=api_result
            )
        ]
        # The LLM generates the final output
        response_final = client.models.generate_content(
            model="gemini-2.0-flash",
            contents=contents + follow_up_contents,
            config=types.GenerateContentConfig(tools=[tools])
        )
        print(response_final.text)
    else:
        print(f"Error: Unknown function call requested: {function_call.name}")
else:
    print("No function call found in the response.")
    print(response.text)

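6. Planning (Orchestrator-Workers)

Source: Zero to One: Learning Agentic Patterns

A pattern in which a central planner LLM breaks a complex task down into dynamic subtasks and has specialized agents execute each one.

An orchestrator/synthesizer LLM then aggregates the results, judges whether the goal has been met, and either produces the output or asks the planner LLM to re-plan.

Features

When several operations are needed, this pattern handles them by making a plan first
Three kinds of LLM are used: one planner, multiple workers, and one orchestrator/synthesizer
The plan changes dynamically with the user's input

Overall flow

The planner LLM makes a plan of which tasks to run
Each worker LLM runs its task
The orchestrator/synthesizer LLM gathers the workers' results and decides whether to finish or to start again from planning

Difference from Routing

A router chooses a single next step, whereas a planner generates a multi-step plan

Use cases

Complex software development: "Build feature XX" -> plan -> implement -> test and document
Research and report generation: research -> extract data -> analyze -> write
Multimodal tasks: image generation, text analysis, data integration
Compound user requests: "Plan a three-day trip to Paris" -> book flights and hotels within the budget

Example

A planner LLM that returns its plan as structured tasks, defined as follows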

import os
from google import genai
from pydantic import BaseModel, Field
from typing import List
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
# Task definition
class Task(BaseModel):
    task_id: int
    description: str
    assigned_to: str = Field(description="Which worker type should handle this? E.g., Researcher, Writer, Coder")

# Plan definition
class Plan(BaseModel):
    goal: str
    steps: List[Task]
 
# Step 1: Planner LLM: create a plan of which tasks to run
user_goal = "Write a short blog post about the benefits of AI agents."
 
prompt_planner = f"""
Create a step-by-step plan to achieve the following goal. 
Assign each step to a hypothetical worker type (Researcher, Writer).
 
Goal: {user_goal}
"""
 
print(f"Goal: {user_goal}")
print("Generating plan...")
 
response_plan = client.models.generate_content(
    model='gemini-2.5-pro-preview-03-25',
    contents=prompt_planner,
    config={
        'response_mime_type': 'application/json',
        'response_schema': Plan,
    },
)
 
# Step 2 onward: worker and orchestrator LLMs would execute the plan and review the results (here the steps are only printed)
for step in response_plan.parsed.steps:
    print(f"Step {step.task_id}: {step.description} (Assignee: {step.assigned_to})")
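
The example above stops at printing the plan. As a rough idea of what the worker and synthesizer steps could look like, here is a minimal offline sketch. `PlannedTask` mirrors the fields of `Task`, while the `WORKERS` table, `execute_plan`, and `synthesize` are hypothetical stand-ins for LLM calls and are not taken from the article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlannedTask:
    task_id: int
    description: str
    assigned_to: str


# Hypothetical workers; each would be its own LLM call in a real system.
WORKERS: dict[str, Callable[[str, list[str]], str]] = {
    "Researcher": lambda description, prior: f"notes on: {description}",
    "Writer": lambda description, prior: f"draft using {len(prior)} earlier result(s)",
}


def execute_plan(steps: list[PlannedTask]) -> list[str]:
    """Run steps in task_id order, giving each worker the results produced so far."""
    results: list[str] = []
    for step in sorted(steps, key=lambda s: s.task_id):
        worker = WORKERS.get(step.assigned_to)
        if worker is None:
            raise ValueError(f"no worker for role: {step.assigned_to}")
        results.append(worker(step.description, results))
    return results


def synthesize(results: list[str]) -> str:
    """Stand-in for the synthesizer LLM: joins the worker outputs."""
    return "\n".join(results)


plan = [
    PlannedTask(1, "benefits of AI agents", "Researcher"),
    PlannedTask(2, "write the blog post", "Writer"),
]
report = synthesize(execute_plan(plan))
```

A fuller version would let the synthesizer reject the result and send the goal back to the planner, which is the re-planning loop described earlier.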

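7. Multi-Agent

Source: Zero to One: Learning Agentic Patterns

A pattern that combines multiple agents with different roles, which interact in both directions or work together.

Features

Each agent has a different role, knowledge, and tools

Overall flow

There are two approaches

Coordinator / Manager approach: a coordinator agent (e.g., a PM) sits at the center, and information is exchanged with the other agents through it
Swarm approach: an agent (e.g., a researcher) coordinates with or hands off to another agent (e.g., a writer)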
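
For contrast with the Swarm approach, the Coordinator / Manager approach can be sketched as a hub that talks to every specialist. The `SPECIALISTS` table and the `coordinator` function below are hypothetical, and the lambdas stand in for agents that would each wrap their own LLM, prompt, and tools.

```python
from typing import Callable

# Hypothetical specialist agents, keyed by role.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "designer": lambda brief: f"designer: mockup ideas for {brief}",
    "engineer": lambda brief: f"engineer: feasibility notes for {brief}",
    "marketer": lambda brief: f"marketer: positioning for {brief}",
}


def coordinator(brief: str, roles: list[str]) -> str:
    """Ask each chosen specialist in turn and merge the replies; all traffic passes through here."""
    replies = [SPECIALISTS[role](brief) for role in roles]
    return "\n".join(replies)


minutes = coordinator("a new app feature", ["designer", "engineer"])
```

The trade-off is that the coordinator is a single point of control: easier to log and debug than peer-to-peer handoffs, but every exchange costs an extra hop.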

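Use cases

Debate and brainstorming with AI personas
"Come up with a new feature for the app"
- Designer
- Engineer
- Marketer
Complex software development
- Planning
- Implementation
- Testing
- Deployment
Virtual experiments and simulations
- Set up many "buyer" and "seller" agents -> observe where the market price settles
Collaborative writing
- Researcher
- Writer
- Editor
- Proofreader

Example (Swarm approach)

A hotel and restaurant booking system. When the user asks for a restaurant booking, the request is handed off to the restaurant agent, which handles the booking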

import os
from google import genai
from pydantic import BaseModel, Field
 
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
# Output schema
class Response(BaseModel):
    # Agent to hand off to
    handoff: str = Field(default="", description="The name/role of the agent to hand off to. Available agents: 'Restaurant Agent', 'Hotel Agent'")
    message: str = Field(description="The response message to the user or context for the next agent")
 
# Run an agent
def run_agent(agent_name: str, system_prompt: str, prompt: str) -> Response:
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents=prompt,
        config = {'system_instruction': f'You are {agent_name}. {system_prompt}', 'response_mime_type': 'application/json', 'response_schema': Response}
    )
    return response.parsed
 
# Define a different system prompt for each agent
hotel_system_prompt = "You are a Hotel Booking Agent. You ONLY handle hotel bookings. If the user asks about restaurants, flights, or anything else, respond with a short handoff message containing the original request and set the 'handoff' field to 'Restaurant Agent'. Otherwise, handle the hotel request and leave 'handoff' empty."
restaurant_system_prompt = "You are a Restaurant Booking Agent. You handle restaurant recommendations and bookings based on the user's request provided in the prompt."
 
# User input
initial_prompt = "Can you book me a table at an Italian restaurant for 2 people tonight?"
print(f"Initial User Request: {initial_prompt}")
 
# Step 1: Run the hotel agent
output = run_agent("Hotel Agent", hotel_system_prompt, initial_prompt)
 
# Step 2: Hand off according to which agent (Restaurant Agent or Hotel Agent) the previous agent named
if output.handoff == "Restaurant Agent":
    print("Handoff Triggered: Hotel to Restaurant")
    output = run_agent("Restaurant Agent", restaurant_system_prompt, initial_prompt)
elif output.handoff == "Hotel Agent":
    print("Handoff Triggered: Restaurant to Hotel")
    output = run_agent("Hotel Agent", hotel_system_prompt, initial_prompt)
 
print(output.message)

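Closing thoughts

The article is organized systematically, so it was fairly easy to read.
Thinking about real products, I also felt that they tend to use several patterns together rather than just one.
It is the kind of article I expect to come back to when I am unsure about an implementation.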
