LangChainのOpenAI Functions AgentとReAct Agent

Posted at 2024-06-04

LangChainでTool呼ぶAgentとしてどちらの方がいいのだろう、と気になって調べました。
OpenAI Functions Agentは、品質良いけど使えるモデルが限定(Open AIのみ？)。
ReAct Agent は品質劣るけど全(?)モデルで使える

出典

OpenAI Functions AgentはUsing OpenAI Functions Agentに書かれて通り

This is probably the most reliable type of agent, but is only compatible with function calling

一方で、ReAct AgentはUsing ReAct Agentに書かれている通り

This is a less reliable type, but is compatible with most models

OpenAI Functions Agentは、専用の訓練をしているらしく、redditにも同じことが書いてあります。

Prompt Template比較

LangSmithのPrompt Templateの比較です。

openai-functions-template

シンプルなPromptです。 Toolsの情報はPromptと別に渡します。
{instruction}は、Using OpenAI Functions Agentの例だと以下を使用。

You are an agent designed to write and execute python code to answer questions.
You have access to a python REPL, which you can use to execute python code.
If you get an error, debug your code and try again.
Only use the output of your code to answer the question.
You might know the answer without running any code, but you should still run the code to get the answer.
If it does not seem like you can write code to answer the question, just return "I don't know" as the answer.

langchain-ai/openai-functions-template

SYSTEM
{instructions}

PLACEHOLDER
chat_history

HUMAN
{input}

PLACEHOLDER
agent_scratchpad

react-agent-template

やはり、汎用的な分だけ複雑なPromptです。openai-functions-templateと違ってToolsの情報がPrompt内にあります。

langchain-ai/react-agent-template

{instructions}

TOOLS:
------

You have access to the following tools:

{tools}

To use a tool, please use the following format:
```` ```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
```` ```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

```` ```
Thought: Do I need to use a tool? No
Final Answer: [your response here]
```` ```

Begin!

Previous conversation history:
{chat_history}

New input: {input}
{agent_scratchpad}

実行時比較

本当は違いがわかりやすいものがあればいいのですが、時間をかけたくなかったので、自分が最近やったものをLangSmithで表示。

OpenAI Functions Agent

openai-functions-agentの記載どおりでChainを組みました。

全体のInputとOutput

各Step。真ん中にtavily_searchを介しています。

1度目のChatOpenAI呼出。Promptと別にToolの情報を渡しているのがわかります。

2つ目は普通の検索ですね。

2度目のChatOpenAI呼出。内容多いので小さい・・・

ReAct Agent

今度は長いです。全然Apple２Appleではないですが、ToolにWikipedia使っています。
全体のInputとOutput。

Step。とても長いです。望む情報を取得できずに何度か検索語句を変えてWikipediaで検索しています。そして、最後にあきらめて、GPT-4そのものが答えを出しています。

全ては省略しますが、いくつかのログをのせます。
1回目のChatOpenAI呼出。

２回目のChatOpenAI呼出。検索語句を変えています。この呼出でも良い回答が無かったので、GPT-4が回答出しています。

プログラムです。Chainlitで画面作っています。

import os

import chainlit as cl
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
# from langchain.globals import set_debug
from langchain.schema.runnable.config import RunnableConfig
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_openai import ChatOpenAI

@cl.on_chat_start
async def on_chat_start():
    wiki = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(lang='ja', features="html.parser"))
    tools = [wiki]

    prompt = hub.pull("hwchase17/react")
    model = ChatOpenAI(model_name=os.environ["OPENAI_API_MODEL"], streaming=True)
    print(model)

    # Agentを作成
    agent = create_react_agent(model, tools, prompt)
    # AgentExecutorを作成
    agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

    cl.user_session.set("agent_executor", agent_executor)


@cl.on_message
async def on_message(message: cl.Message):
    runnable = cl.user_session.get("agent_executor")

    msg = cl.Message(content="")

    async for chunk in runnable.astream(
        {"input": message.content},
        config=RunnableConfig(callbacks=[cl.LangchainCallbackHandler()]),
    ):
        # Agent Action
        if "actions" in chunk:
            for action in chunk["actions"]:
                print(f"Calling Tool: `{action.tool}` with input `{action.tool_input}`")
        # Observation
        elif "steps" in chunk:
            for step in chunk["steps"]:
                print(f"Tool Result: `{step.observation}`")
        # Final result
        elif "output" in chunk:
            print(f'Final Output: {chunk["output"]}')
            await msg.stream_token(chunk['output'])
        print("---")
    await msg.send()

関連するバージョン情報です。

project.toml

python = "^3.12"
langchain = "^0.2.1"
langchain-openai = "^0.1.8"
langsmith = "^0.1.65"
chainlit = "^1.1.202"
langchainhub = "^0.1.17"
wikipedia = "^1.4.0"
langchain-community = "^0.2.1"

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up