Reasoning Model
ReAct
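A framework in which an agent reasons, acts, and adapts by repeating the cycle Thought → Action → Observation.
- Thought
Based on the user's query and past Observations, break the task down and plan.
If needed, identify the tool to use and output the call in a structured form (JSON or tool calling).
- Action
Based on the output of the Thought step, run the external tool (an API, for example).
- Observation
Take the result of the Action (API response, error message, and so on) as the Observation, append it to the prompt, and feed it back to the LLM.
Based on the Observation, the LLM produces another Thought and then either moves on to the next Action, if one is needed, or outputs the final answer.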
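The cycle described above can be sketched as a small loop. This is a minimal illustration, not a reference implementation: `scripted_llm` is a hypothetical stand-in for a real model call, and `TOOLS`, `react_loop`, and `max_steps` are names made up for this sketch.

```python
import json
from typing import Callable, Dict

def get_weather(location: str) -> str:
    # Dummy tool; a real one would call a weather API.
    return f"The weather in {location} is sunny with low temperatures."

# Hypothetical tool registry: tool name -> Python callable.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}

def scripted_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM: asks for a tool first,
    # then answers once an Observation is present in the prompt.
    if "Observation:" not in prompt:
        return ('Thought: I need the current weather in London.\n'
                'Action:\n'
                '{"action": "get_weather", "action_input": {"location": "London"}}\n')
    last_obs = prompt.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I now know the final answer\nFinal Answer: {last_obs}"

def react_loop(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(prompt)                      # Thought (+ Action or Final Answer)
        prompt += output
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        action = json.loads(output.split("Action:", 1)[1])  # parse the JSON blob
        observation = TOOLS[action["action"]](**action["action_input"])  # Action
        prompt += f"Observation: {observation}\n"  # Observation, fed back next turn
    raise RuntimeError("no final answer within max_steps")

print(react_loop("What's the weather in London?", scripted_llm))
```

The `max_steps` cap is a design choice of this sketch so that a model that never emits `Final Answer:` cannot loop forever.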
Concrete example: the process of answering the question "What's the weather in London?"
System Prompt
# This system prompt is a bit more complex and actually contains the function description already appended.
# Here we suppose that the textual description of the tools has already been appended.
SYSTEM_PROMPT = """Answer the following questions as best you can. You have access to the following tools:
get_weather: Get the current weather in a given location
The way you use the tools is by specifying a json blob.
Specifically, this json should have an `action` key (with the name of the tool to use) and an `action_input` key (with the input to the tool going here).
The only values that should be in the "action" field are:
get_weather: Get the current weather in a given location, args: {"location": {"type": "string"}}
example use :
{{
"action": "get_weather",
"action_input": {"location": "New York"}
}}
ALWAYS use the following format:
Question: the input question you must answer
Thought: you should always think about one action to take. Only one action at a time in this format:
Action:
$JSON_BLOB (inside markdown cell)
Observation: the result of the action. This Observation is unique, complete, and the source of truth.
(this Thought/Action/Observation can repeat N times, you should take several steps when needed. The $JSON_BLOB must be formatted as markdown and only use a SINGLE action at a time.)
You must always end your output with the following format:
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Now begin! Reminder to ALWAYS use the exact characters `Final Answer:` when you provide a definitive answer. """
- Prompt (initial input)
User: What's the weather in London?
- Thought (LLM output)
Thought: To answer the question, I need to get the current weather in London.
Action:
{
  "action": "get_weather",
  "action_input": {"location": "London"}
}
At this point, generation is stopped at "Observation:". (If it is not stopped, the LLM goes on and makes up the observation itself.)
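Many LLM APIs expose a stop-sequence setting for this purpose; the exact parameter depends on the client in use. As a client-side fallback, the output can also be truncated after the fact. The sketch below assumes the output format from the system prompt above; `truncate_at_stop` and `parse_action` are hypothetical helper names, and the `raw` string is invented sample output.

```python
import json

def truncate_at_stop(text: str, stop: str = "Observation:") -> str:
    """Keep only what precedes the first occurrence of the stop string."""
    return text.split(stop, 1)[0]

def parse_action(text: str) -> dict:
    """Extract the JSON blob that follows 'Action:'.

    Slicing from the first '{' to the last '}' tolerates a surrounding
    markdown fence or stray whitespace around the blob.
    """
    blob = text.split("Action:", 1)[1]
    start, end = blob.index("{"), blob.rindex("}") + 1
    return json.loads(blob[start:end])

# Invented sample: the model kept going and hallucinated an observation.
raw = (
    "Thought: I need the current weather in London.\n"
    "Action:\n"
    '{"action": "get_weather", "action_input": {"location": "London"}}\n'
    "Observation: The weather in London is rainy.\n"
    "Thought: I now know the final answer\n"
)

clean = truncate_at_stop(raw)   # hallucinated Observation is dropped
action = parse_action(clean)
print(action["action"], action["action_input"])
```

Only the real tool result should ever follow `Observation:`, which is why everything the model wrote after that marker is discarded here.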
- Action (run the external tool)
def get_weather(location):
    # Dummy tool for the example; a real implementation would call a weather API.
    return f"The weather in {location} is sunny with low temperatures."
observation = get_weather("London")
# → "The weather in London is sunny with low temperatures."
- Observation (appended to the prompt)
Observation: The weather in London is sunny with low temperatures.
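Here the observation is a successful result, but error messages are fed back the same way so that the model can adapt on its next Thought. A minimal sketch under that assumption; `run_action` and the `TOOLS` registry are hypothetical names, and the error wording is illustrative.

```python
from typing import Callable, Dict

def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny with low temperatures."

# Hypothetical tool registry: tool name -> Python callable.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}

def run_action(action: dict) -> str:
    """Run the requested tool and always return an 'Observation: ...' line."""
    try:
        tool = TOOLS[action["action"]]
        result = tool(**action["action_input"])
    except KeyError as exc:
        # Unknown tool name, or a missing "action" / "action_input" key.
        result = f"Error: unknown tool or missing key {exc}"
    except TypeError as exc:
        # Arguments do not match the tool's signature.
        result = f"Error: bad arguments ({exc})"
    return f"Observation: {result}\n"

print(run_action({"action": "get_weather", "action_input": {"location": "London"}}))
print(run_action({"action": "get_time", "action_input": {}}))
```

Returning the error as text instead of raising keeps the loop alive and lets the model correct the tool name or arguments itself.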
- Thought again → Final Answer
Thought: I now know the final answer.
Final Answer: The weather in London is sunny with low temperatures.
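Difference from CoT
CoT is a prompting technique that encourages the model to write out its reasoning process and then give the final output, all within a single response. ReAct is a design framework for agentic systems that includes tool calls and similar actions.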
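For contrast, a CoT call can be sketched as a single prompt and a single response, with no tool execution in between. The prompt wording and the names `build_cot_prompt` and `cot_answer` are assumptions for illustration; `llm` is any callable that maps a prompt string to a response string.

```python
from typing import Callable

def build_cot_prompt(question: str) -> str:
    # The reasoning is requested inside the same response as the answer.
    return (f"Question: {question}\n"
            "Let's think step by step, then state the final answer.")

def cot_answer(question: str, llm: Callable[[str], str]) -> str:
    # Exactly one LLM call: no Action, no Observation, no loop.
    return llm(build_cot_prompt(question))
```

A ReAct agent would instead call the model repeatedly, running a tool between calls and appending each Observation to the prompt.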
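Reasoning models such as OpenAI o1
These are models trained to reason, rather than models made to reason through prompting.
They are trained to reason first, with the reasoning marked off by tags in the output.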
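When a model does expose its reasoning between tags, the reasoning can be separated from the answer on the client side. The sketch below assumes a `<think>...</think>` pair; the tag name is an assumption, since it differs by model and some models do not return their reasoning at all. `split_reasoning` is a hypothetical helper.

```python
import re
from typing import Tuple

def split_reasoning(output: str, tag: str = "think") -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" if no tagged block is found."""
    # Assumed format: <think> ... </think> followed by the answer.
    pattern = re.compile(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", re.DOTALL)
    match = pattern.search(output)
    if match is None:
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = (output[:match.start()] + output[match.end():]).strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>The user asks about London weather.</think>\nIt is sunny in London."
)
print(reasoning)
print(answer)
```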