
Experiments and observations on function definitions in OpenAI Function calling

Posted at 2023-10-15, last updated at 2023-10-15

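Introduction

OpenAI Function calling is a feature that only became available in June 2023.

Functionally it is fair to call it equivalent to what LangChain's Agent already offered, but LangChain's Agent works by forcing a structure onto the text that the LLM returns in response to the prompt.

As long as the input prompt is short, the model returns the requested structure without trouble. As the prompt grows longer and more complex, however, LLMs (GPT-3.5 in particular) seem to return broken responses quite readily.

OpenAI Function calling does not ask for structure through the prompt. It is a function-calling mechanism built in on OpenAI's side, so I think it may be the better choice.¹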

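I was curious how the function definitions in OpenAI Function calling (by "function definitions" I mean the functions parameter passed to the API) affect the LLM's behavior, so I ran a few light experiments. This post shares the results together with my thoughts on them, and it is a fairly casual write-up.

Basic knowledge of OpenAI Function calling is assumed. If you are not familiar with it, please read https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models first.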

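Experimenting with function definitions

How do function definitions affect the LLM's behavior?
Asked that question, most people would answer that the LLM uses them to look for a function suitable for responding to the user, to decide whether to call it, and to determine the arguments for the call.

That is certainly correct. On closer thought, though, the LLM must also be looking at the results of past function calls when it decides on its reply, and I suspect that it consults the function definitions at that point as well.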

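The reason is that the only information about past function calls the LLM can get from messages is:

the function name
the arguments
the return value

Nothing in messages says what the function actually does.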
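
To make this concrete, here is a minimal sketch of what a completed function call looks like inside messages. The history contents and the helper name past_calls are illustrative assumptions, not part of the original experiment; only the message shape (an assistant message with function_call, followed by a function message) follows the API described in the cookbook.

```python
import json

# Hypothetical history: an assistant message carrying a function_call,
# followed by a "function" role message carrying the return value.
history = [
    {"role": "user", "content": "What was the weather in Tokyo on 2023/10/1?"},
    {
        "role": "assistant",
        "content": None,
        "function_call": {
            "name": "get_weather1",
            "arguments": json.dumps({"date": "2023-10-01"}),
        },
    },
    {"role": "function", "name": "get_weather1", "content": "sunny"},
]


def past_calls(messages):
    """Return (name, arguments, result) for each completed function call."""
    calls = []
    pending = None
    for message in messages:
        if message["role"] == "assistant" and message.get("function_call"):
            pending = message["function_call"]
        elif message["role"] == "function" and pending is not None:
            arguments = json.loads(pending["arguments"])
            calls.append((pending["name"], arguments, message["content"]))
            pending = None
    return calls


print(past_calls(history))
```

Nothing in the extracted triple says that get_weather1 is about Tokyo; that meaning lives only in the function definition.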

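So let's run an experiment to see whether this guess is right.

The code below prepares get_weather1, a function that returns the past weather in Tokyo, and get_weather2, a function that returns the past weather in San Francisco, and then asks for the weather in Tokyo and San Francisco on 2023/10/1.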

import openai
import requests
from tenacity import retry, wait_random_exponential, stop_after_attempt
from termcolor import colored

GPT_MODEL = "gpt-3.5-turbo-0613"

# copy from https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }
    
    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "function":
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message["role"]]))

# copy from https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models
@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, functions=None, function_call=None, model=GPT_MODEL):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + openai.api_key,
    }
    json_data = {"model": model, "messages": messages}
    if functions is not None:
        json_data.update({"functions": functions})
    if function_call is not None:
        json_data.update({"function_call": function_call})
    try:
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=json_data,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e

functions = [
    {
        "name": "get_weather1",
        "description": "Get the weather of a past day in Tokyo",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "Date",
                },
            },
            "required": ["date"],
        },
    },
    {
        "name": "get_weather2",
        "description": "Get the weather of a past day in San Francisco",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "Date",
                },
            },
            "required": ["date"],
        },
    },
]

def call(func_name: str) -> str:
    if func_name == "get_weather1":
        return "sunny"
    elif func_name == "get_weather2":
        return "rainy"

messages = []
messages.append({"role": "system", "content": "You are an assistant."})
messages.append({"role": "user", "content": "What was the weather in Tokyo and San Francisco on 2023/10/1."})
chat_response = chat_completion_request(
    messages, functions=functions
)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)
# Sunny in TYO
messages.append({"role": "function", "name": assistant_message["function_call"]["name"], "content": call(assistant_message["function_call"]["name"])})

chat_response = chat_completion_request(
    messages, functions=functions
)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)

# Rainy in SFO
messages.append({"role": "function", "name": assistant_message["function_call"]["name"], "content": call(assistant_message["function_call"]["name"])})

chat_response = chat_completion_request(
    messages,
    # Change this value later
    functions=functions
)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)

pretty_print_conversation(messages)
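
The script above assumes that the model asks for exactly one function call in each of the first two responses. If the model answers directly instead, the function_call lookup raises a KeyError. A more defensive loop might look like the sketch below. The name run_until_answer and the injected request_fn and call_fn parameters are my own assumptions for illustration, chosen so that the loop itself makes no network calls.

```python
def run_until_answer(messages, functions, request_fn, call_fn, max_turns=5):
    """Keep resolving function calls until the assistant answers in plain text.

    request_fn(messages, functions) must return the assistant message dict.
    call_fn(name) must return the function result as a string.
    """
    for _ in range(max_turns):
        assistant_message = request_fn(messages, functions)
        messages.append(assistant_message)
        function_call = assistant_message.get("function_call")
        if not function_call:
            # No function requested: this is the final answer for the user.
            return assistant_message
        # Record the function result so the next request can see it.
        messages.append({
            "role": "function",
            "name": function_call["name"],
            "content": call_fn(function_call["name"]),
        })
    raise RuntimeError("no final answer within max_turns")
```

With the helpers from the script, request_fn could be a small wrapper such as lambda m, f: chat_completion_request(m, functions=f).json()["choices"][0]["message"], and call_fn could be call.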

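The result (shown as a screenshot in the original post) is as follows.
The model first checks the weather in Tokyo and learns that it was sunny, then checks the weather in San Francisco and learns that it was rainy, and finally answers correctly that Tokyo was sunny and San Francisco was rainy.

Next, we prepare functions_rev, in which the definitions for Tokyo and San Francisco are swapped.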

functions_rev = [
    {
        "name": "get_weather2",
        "description": "Get the weather of a past day in Tokyo",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "Date",
                },
            },
            "required": ["date"],
        },
    },
    {
        "name": "get_weather1",
        "description": "Get the weather of a past day in San Francisco",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "Date",
                },
            },
            "required": ["date"],
        },
    },
]

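What happens if, for the very last request to the LLM only, we change the line under the # Change this value later comment to functions=functions_rev so that the swapped definitions are used, while the two earlier requests keep using functions?

We succeeded in making the LLM mistakenly believe that Tokyo was rainy and San Francisco was sunny.
From this we can say with confidence that the model consults the function definitions in order to understand what the results of past function calls mean.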
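
As a side note, the swapped definitions do not have to be written by hand. The sketch below derives them from the original list. The helper name swap_descriptions and the abbreviated sample_functions are assumptions for illustration, and the resulting list order differs from the hand-written functions_rev; whether the order matters to the model was not tested here.

```python
import copy


def swap_descriptions(functions):
    """Return a deep copy of a two-element functions list with descriptions swapped."""
    if len(functions) != 2:
        raise ValueError("expected exactly two function definitions")
    swapped = copy.deepcopy(functions)
    swapped[0]["description"] = functions[1]["description"]
    swapped[1]["description"] = functions[0]["description"]
    return swapped


# Abbreviated definitions (parameters reduced to the single date argument).
date_parameters = {
    "type": "object",
    "properties": {"date": {"type": "string", "description": "Date"}},
    "required": ["date"],
}
sample_functions = [
    {
        "name": "get_weather1",
        "description": "Get the weather of a past day in Tokyo",
        "parameters": date_parameters,
    },
    {
        "name": "get_weather2",
        "description": "Get the weather of a past day in San Francisco",
        "parameters": date_parameters,
    },
]

for definition in swap_descriptions(sample_functions):
    print(definition["name"], "->", definition["description"])
```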

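Other experiments with function definitions

What happens if the functions parameter is omitted from the last request to the LLM?
As it turns out, the model answers correctly that Tokyo was sunny and San Francisco was rainy.

This strikes me as a rather bold guess on the model's part. Without the definitions, all the LLM can know is that get_weather1 returned sunny and get_weather2 returned rainy. It has no way of knowing which result belongs to Tokyo and which to San Francisco. I think it is fair to say that it guessed "my past self must have asked about Tokyo first" and simply happened to be right.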

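As one more experiment, what happens if we rename get_weather1 to get_weather_tyo and get_weather2 to get_weather_sfo, and then pass functions=functions_rev in the last request to the LLM? In functions_rev, the description of get_weather_sfo becomes Get the weather of a past day in Tokyo, and the description of get_weather_tyo becomes Get the weather of a past day in San Francisco.

As it turns out, the model answers correctly that Tokyo was sunny and San Francisco was rainy.
This is a somewhat troubling result, because the final inference contradicts the function definitions it was given.

That said, LLMs do not behave in a tidy, strictly logical way, so it is probably best not to read too much into this. If anything, it is rather human. A person could easily make the same mistake, assuming that a function named get_weather_tyo returns Tokyo's weather without bothering to check the description.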
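
The renamed definitions are not listed in this post, so here is a sketch of what they might look like, reconstructed from the description above. The helper weather_function and the variable names functions_named and functions_named_rev are assumptions for illustration.

```python
def weather_function(name, city):
    """Build a function definition in the same shape as the ones used above."""
    return {
        "name": name,
        "description": f"Get the weather of a past day in {city}",
        "parameters": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "description": "Date"},
            },
            "required": ["date"],
        },
    }


# Consistent definitions, used for the first two requests.
functions_named = [
    weather_function("get_weather_tyo", "Tokyo"),
    weather_function("get_weather_sfo", "San Francisco"),
]

# Contradictory definitions, used only for the last request:
# the names say one city while the descriptions say the other.
functions_named_rev = [
    weather_function("get_weather_sfo", "Tokyo"),
    weather_function("get_weather_tyo", "San Francisco"),
]
```

In this setup the function name and the description disagree, and the result described above suggests that the model sided with the name.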

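Discussion

I ran several experiments, but the lessons are simple.

Always keep passing the definitions of functions that have already been called until the reply to the user is complete.
Give functions names that match what they actually do.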
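
The first lesson can be checked mechanically before each request. The sketch below lists every function that appears as a call in the history but is missing from the functions list about to be sent. The helper name missing_definitions is an assumption for illustration.

```python
def missing_definitions(messages, functions):
    """Return names of functions called in messages but absent from functions."""
    defined = {definition["name"] for definition in (functions or [])}
    missing = []
    for message in messages:
        if message["role"] != "assistant":
            continue
        function_call = message.get("function_call")
        if not function_call:
            continue
        name = function_call["name"]
        if name not in defined and name not in missing:
            missing.append(name)
    return missing
```

An empty list means every past call still has its definition attached. A non-empty list means the model would have to guess what those past results meant.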

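Because of this, with the current API interface, given functions A, B, and C, it seems best to assume that you basically cannot do something like "once A has been called, let the model choose only from B and C". Dropping A from functions would also take away the definition the model needs in order to interpret A's earlier result.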

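I do not know how much demand there is for deciding the selectable functions based on the history of function calls. If OpenAI were to support it in the API, though, it would be nice if function_call could take the set of function names that are candidates for the next call, even though that changes its current meaning somewhat. The set of definitions and the set of choices could then be managed separately. On top of that, adding a new boolean parameter such as must_call to specify whether a function call is mandatory would seem to preserve the current functionality as well.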
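
To illustrate how this proposal relates to the existing parameter, the sketch below maps the proposed pair (a set of allowed names plus must_call) onto today's function_call values, which are "none", "auto", or {"name": ...}. Both the set-valued function_call and must_call are hypothetical and do not exist in the OpenAI API; the helper name to_current_function_call is also an assumption.

```python
def to_current_function_call(allowed, must_call, all_names):
    """Translate the proposed (allowed, must_call) pair into today's function_call.

    Returns "none", "auto", or {"name": ...} when the combination can be
    expressed with the current parameter, and None when it cannot.
    """
    allowed = set(allowed)
    all_names = set(all_names)
    if not allowed <= all_names:
        raise ValueError("allowed contains names without a definition")
    if not allowed:
        if must_call:
            raise ValueError("must_call requires at least one allowed function")
        return "none"  # no function may be called
    if not must_call and allowed == all_names:
        return "auto"  # the model chooses freely among all definitions
    if must_call and len(allowed) == 1:
        return {"name": next(iter(allowed))}  # force one specific function
    return None  # not expressible with the current parameter


print(to_current_function_call({"B", "C"}, False, {"A", "B", "C"}))
```

The "only B and C after A" case comes out as None, which is exactly the gap the proposal is meant to fill, while the three existing modes are all still covered.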

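¹ LangChain's Agent can also use OpenAI Function calling. However, if you are going to use OpenAI Function calling, you can simply use the raw API directly, and my impression is that there is little benefit in using it through an Agent.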
