
KDDI Corporation

How to call a self-made ChatGPT-spec plugin using Function calling

Last updated at 2023-07-23 / Posted at 2023-07-23

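Preface

Previously, I tried my hand at developing a ChatGPT-spec plugin.
I used Langchain as the AI orchestration layer that calls the plugin and experimented in various ways, but the calls failed more often than they succeeded.

Mapping what I did last time onto the overall picture of the Copilot stack × Plugins that Microsoft promotes gives roughly the following.

Create a self-made plugin
Call the self-made plugin from Langchain
Plugin integration is based on the OpenAPI specification

A more concrete diagram of this looks like the following.

Note: I did no prompt tuning, so there is plenty of room to improve accuracy.
Specifically, the request URL was often wrong.
My impression was that adjusting that part would probably make it work.

The plugin provided two features.

Show the ToDo list
Add a ToDo

I did not want to spend much effort on prompt tuning for features this simple, and reading the Langchain source code is tiring. (In the end I used Langchain as the implementation reference, so I read the source code anyway.)
With that background, I concluded: "I guess I'll implement it myself with Function calling."
For simple tasks, it is surprisingly easy to implement.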

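Overview of what we will do

To summarize the overall picture first, it looks like the following.

The rough flow is as follows.

① Get information about the plugin's features and use cases
② Get the plugin specification (openapi.yaml)
③ Generate functions for Function calling based on the retrieved plugin specification
④ Execute Function calling

Step ① is not implemented this time. If implemented, I expect the processing would be similar to step ②.
The idea is to fetch the contents of ai-plugin.json and tell the foundation model "this is what the plugin is like."
Step ③ is handled entirely by a program that converts an input openapi.yaml into functions for Function calling, which also makes it easy to integrate other plugins.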
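
Step ① is not implemented in this article, but the idea can be sketched roughly as follows. This is only a minimal sketch under assumptions: `describe_plugin` is a hypothetical helper, the sample manifest content is made up for illustration, and a real manifest would be fetched from the plugin server's .well-known/ai-plugin.json rather than written inline.

```python
# Hypothetical helper: turn an ai-plugin.json manifest into text
# that could be added to the system prompt.
def describe_plugin(manifest: dict) -> str:
    """Build a short plugin description from an ai-plugin.json manifest."""
    name = manifest.get("name_for_model", "unknown_plugin")
    description = manifest.get("description_for_model", "")
    return f"Plugin name: {name}\nWhen to use it: {description}"


# Sample manifest content (made up for illustration; a real one would be
# fetched from <base_url>/.well-known/ai-plugin.json).
sample_manifest = {
    "schema_version": "v1",
    "name_for_model": "todo",
    "description_for_model": "Plugin for managing a TODO list.",
    "api": {"type": "openapi", "url": "http://localhost:5000/openapi.yaml"},
}

plugin_description = describe_plugin(sample_manifest)
print(plugin_description)
```

The resulting text could, for example, be appended to the system prompt so that the model knows what the plugin is for before it sees the individual functions.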

The rest of this article summarizes the core parts of steps ② onward.

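a. Getting the plugin specification

I used the Langchain implementation as a reference.
What it does is fairly simple: it just fetches openapi.yaml from the server that hosts the plugin.

Getting the plugin specification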

import requests
import yaml

base_url = "http://localhost:5000"
openapi_url = f"{base_url}/openapi.yaml"
response = requests.get(openapi_url)
openapi_yaml = response.text
openapi_data = yaml.safe_load(openapi_yaml)

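b. Generating functions for Function calling from the plugin specification

I think this is the key part.
Using the retrieved plugin specification data, we turn it into functions that Function calling can use.

Implementing Function calling requires defining the following two pieces of information.

The function to use with Function calling
- The actual processing
The function's metadata
- Information such as the function's name, its arguments, and when to use it
- This is what is passed to functions= when calling the OpenAI API
- (I am not sure "metadata" is the official term, but this article calls it the function's metadata.)

In my implementation, I write each function to be used with Function calling as a class.
Langchain's tools and Semantic Kernel's skills (plugins) are implemented this way, and I imitate them because it is easy to handle.
That said, any implementation is fine here as long as Function calling can be executed.

Implementation example using a class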

import json
import os

import openai  # assumes the pre-1.0 openai package, which provides openai.ChatCompletion

# Class that defines a feature to use with Function calling
class GetTodo:
    # Function metadata
    metadata = {
        "name": "getTodos",
        "description": "Get the list of todos",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "enum": ["http://localhost:5000/todos"],
                    "description": "Request URL"
                },
                "method": {
                    "type": "string",
                    "enum": ["GET"],
                    "description": "HTTP request method"
                }
            },
            "required": ["url", "method"]
        }
    }

    # The function itself
    def run(self, url: str, method: str):
        # Write the actual processing here
        return "getTodo results"


# Make it usable with Function calling
get_todo = GetTodo()
functions_metadata = [get_todo.metadata]
functions_callable = {get_todo.metadata["name"]: get_todo.run}

# System prompt
SYSTEM_PROMPT = """
You are an assistant that helps the user.
To answer the user's input correctly, you can think carefully, step by step.
First consider what is needed to achieve the goal, then explain your thinking and actions.
"""

# Messages
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "I want to see the ToDo list"},
]

# Azure OpenAI settings
openai.api_type = "azure"
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai.api_version = os.getenv("AZURE_OPENAI_API_VERSION")

# Run Function calling
response = openai.ChatCompletion.create(
    engine=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    messages=messages,
    functions=functions_metadata,
    function_call="auto",
    temperature=0,
)

# Run the function the model asked for
msg = response["choices"][0]["message"]
func_name = msg["function_call"]["name"]
# The arguments arrive as a JSON string, so decode them first
func_args = json.loads(msg["function_call"]["arguments"])
func_result = functions_callable[func_name](**func_args)

print(func_result)
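
The dispatch step at the end of the example above can be tried without calling the API. The following is a minimal sketch under assumptions: `fake_get_todos` and `dispatch` are hypothetical names, and `msg` is a hand-written dictionary shaped like the message the example reads from response["choices"][0]["message"]. Unlike the example above, it also covers the case where the model answers in plain text and no function_call is present.

```python
import json


# Hypothetical stand-in for a real function: returns a fixed result.
def fake_get_todos(url: str, method: str) -> dict:
    return {"todos": ["todo1", "todo2"]}


functions_callable = {"getTodos": fake_get_todos}

# Hand-written message in the shape the example above reads; no API call is made.
msg = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "getTodos",
        "arguments": '{"url": "http://localhost:5000/todos", "method": "GET"}',
    },
}


def dispatch(message: dict, registry: dict):
    """Run the function named in message["function_call"], if there is one."""
    call = message.get("function_call")
    if call is None:
        return None  # the model answered in plain text
    if call["name"] not in registry:
        raise KeyError(f"Unknown function: {call['name']}")
    # "arguments" is a JSON string, so decode it before unpacking.
    args = json.loads(call["arguments"])
    return registry[call["name"]](**args)


func_result = dispatch(msg, functions_callable)
print(func_result)
```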

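With the above in mind, the goal of this step is to generate such classes dynamically.
If classes can be generated dynamically from openapi.yaml, the result is a Function calling setup that can use a variety of plugins.

The rough processing is as follows.

Prepare a base class
Extract the necessary information from openapi.yaml
Generate the classes for Function calling

Preparing the base class

The base class defines the following two pieces of information.

The function to use with Function calling
- The actual processing
The base for the function's metadata
- Information such as the function's name, its arguments, and when to use it

ChatGPT-spec plugins are used via an API.
The base class therefore defines a function (run()) that can send HTTP requests.
It takes the following three arguments.

url
- The API endpoint.
- An enum is used so that the value is always an endpoint listed in openapi.yaml.
method
- The HTTP method.
body
- The request body.

Base class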

import requests

# Base class for the Function calling functions
class BaseClass:
    # Base for the function metadata
    metadata_base = {
        "name": "",
        "description": "",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "URL of the API endpoint",
                    "enum": []
                },
                "method": {
                    "type": "string",
                    "description": "HTTP method",
                    "enum": []
                },
                "body": {},
            },
            "required": [],
        },
    }

    def __init__(self):
        pass

    # Common run method shared by all generated classes.
    # It takes no self and is called via the class, so it is a staticmethod.
    @staticmethod
    def run(url: str, method: str, body: dict = None) -> dict:
        # Send the HTTP request that matches the method
        if method == "GET":
            response = requests.get(url)
        elif method == "POST":
            response = requests.post(url, json=body)
        else:
            raise Exception("Not supported method")

        # Return the response body parsed as JSON
        response_body = response.json()
        return response_body
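
The enum narrows what the model is expected to return, but the arguments could still be checked before the request is sent. The following is a minimal sketch of such a check, not part of the original implementation: `validate_args` is a hypothetical helper, and the metadata is written inline in the same shape as metadata_base.

```python
def validate_args(metadata: dict, args: dict) -> list:
    """Return a list of problems found in args (an empty list means OK)."""
    params = metadata["parameters"]
    problems = []
    # Every required argument must be present.
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Arguments whose schema has a non-empty enum must use one of its values.
    for key, value in args.items():
        allowed = params["properties"].get(key, {}).get("enum")
        if allowed and value not in allowed:
            problems.append(f"{key}={value!r} is not one of {allowed}")
    return problems


# Metadata in the same shape as metadata_base, filled in for getTodos.
get_todos_metadata = {
    "name": "getTodos",
    "description": "Get the list of todos",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "enum": ["http://localhost:5000/todos"]},
            "method": {"type": "string", "enum": ["GET"]},
        },
        "required": ["url", "method"],
    },
}

ok = validate_args(
    get_todos_metadata,
    {"url": "http://localhost:5000/todos", "method": "GET"},
)
bad = validate_args(
    get_todos_metadata,
    {"url": "http://localhost:5000/todo", "method": "GET"},  # wrong URL
)
print(ok)
print(bad)
```

A check like this could run just before run() is called, so that a wrong URL is reported instead of being requested.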

We use this base class to generate the classes for Function calling.

Extracting the necessary information from openapi.yaml

To fill in the function metadata, we extract the necessary information from openapi.yaml.
The metadata defined in the base class looks like this.

Metadata base

{
    "name": "",
    "description": "",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "URL of the API endpoint",
                "enum": []
            },
            "method": {
                "type": "string",
                "description": "HTTP method",
                "enum": []
            },
            "body": {}
        },
        "required": []
    }
}

The information needed to fill in the empty parts of the JSON above is pulled from openapi.yaml.
For this article, openapi.yaml is defined as follows.

Definition of openapi.yaml

openapi: 3.0.1
info:
  title: TODO Plugin
  description: A plugin that allows the user to create and manage a TODO list using ChatGPT.
  version: 'v1'
servers:
  - url: http://localhost:5000
paths:
  /todos:
    get:
      operationId: getTodos
      summary: Get the list of todos
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/getTodoResponse'

    post:
      operationId: postTodo
      summary: Add a todo to the list
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/postTodoRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/postTodoResponse'

components:
  schemas:
    getTodoResponse:
      type: object
      properties:
        todos:
          type: array
          items:
            type: string
          description: The list of todos
    postTodoRequest:
      type: object
      properties:
        todo:
          type: string
          description: The todo to add
    postTodoResponse:
      type: object
      properties:
        todos:
          type: array
          items:
            type: string
          description: The list of todos

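/todos
- GET: show the current ToDo list
- POST: add an item to the ToDo list

Any processing is fine as long as it can fill in the metadata. One example is shown below.

Create a function that builds the metadata
- It generates the metadata from the endpoint, the method type, the method definition, and the schema information
Each request path can have multiple HTTP methods, so the caller invokes the function in a loop

Creating the metadata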

import copy

import requests
import yaml

# Function that builds Function calling metadata from openapi.yaml
def create_function_metadata_from_openapi(
        endpoint: str,
        method: str,
        method_info: dict,
        schemas: dict
    ) -> dict:

    # Deep-copy the metadata base so that the nested dicts in
    # BaseClass.metadata_base are not modified
    metadata = copy.deepcopy(BaseClass.metadata_base)

    # Get the data needed to build the metadata
    name = method_info.get("operationId")
    description = method_info.get("summary")
    method = method.upper()
    if method == "POST":
        # Look up the request body schema referenced by $ref
        schema_ref = method_info.get("requestBody", {}).get("content", {}).get("application/json", {}).get("schema", {}).get("$ref", "")
        schema_name = schema_ref.split("/")[-1]
        body = schemas.get(schema_name, {})
        required_properties = ["url", "method", "body"]
    else:
        body = {}
        required_properties = ["url", "method"]

    # Build the metadata
    metadata["name"] = name
    metadata["description"] = description
    metadata["parameters"]["properties"]["url"]["enum"] = [endpoint]
    metadata["parameters"]["properties"]["method"]["enum"] = [method]
    if body == {}:  # Remove body when it is empty
        metadata["parameters"]["properties"].pop("body")
    else:
        metadata["parameters"]["properties"]["body"] = body
    metadata["parameters"]["required"] = required_properties

    return metadata


# Download openapi.yaml from the specified URL
base_url = "http://localhost:5000"
openapi_url = f"{base_url}/openapi.yaml"
response = requests.get(openapi_url)
openapi_yaml = response.text
openapi_data = yaml.safe_load(openapi_yaml)

# The server URL comes from the servers section of openapi.yaml
server_url = openapi_data["servers"][0]["url"]

# Build the metadata for each path and method in openapi.yaml
for path, methods in openapi_data.get("paths", {}).items():
    for method, method_info in methods.items():
        metadata = create_function_metadata_from_openapi(
            endpoint=server_url + path,
            method=method,
            method_info=method_info,
            schemas=openapi_data.get("components", {}).get("schemas", {}),
        )

        print(metadata)
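
The code above looks up the request body schema by taking the last segment of the $ref string. A slightly more general lookup can walk the whole reference path. This is only a minimal sketch: `resolve_ref` is a hypothetical helper, it handles only local references starting with "#/", and it ignores the escaping rules for "~" and "/" inside names.

```python
def resolve_ref(document: dict, ref: str) -> dict:
    """Follow a local reference such as '#/components/schemas/postTodoRequest'."""
    if not ref.startswith("#/"):
        raise ValueError(f"Only local references are handled: {ref}")
    node = document
    # Walk down the document one path segment at a time.
    for part in ref[2:].split("/"):
        node = node[part]
    return node


# Part of the openapi.yaml from this article, written as a Python dict.
openapi_data = {
    "components": {
        "schemas": {
            "postTodoRequest": {
                "type": "object",
                "properties": {
                    "todo": {"type": "string", "description": "The todo to add"}
                },
            }
        }
    }
}

body_schema = resolve_ref(openapi_data, "#/components/schemas/postTodoRequest")
print(body_schema)
```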

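Generating the classes for Function calling

With the steps so far, we have created metadata for each method of the request paths defined in openapi.yaml.

/todos
- GET: show the current ToDo list → metadata generated as getTodos
- POST: add an item to the ToDo list → metadata generated as postTodo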

Generated metadata

[
  {
    "name": "getTodos",
    "description": "Get the list of todos",
    "parameters": {
      "type": "object",
      "properties": {
        "url": {
          "type": "string",
          "description": "URL of the API endpoint",
          "enum": ["http://localhost:5000/todos"]
        },
        "method": {
          "type": "string",
          "description": "HTTP method",
          "enum": ["GET"]
        }
      },
      "required": ["url", "method"]
    }
  },
  {
    "name": "postTodo",
    "description": "Add a todo to the list",
    "parameters": {
      "type": "object",
      "properties": {
        "url": {
          "type": "string",
          "description": "URL of the API endpoint",
          "enum": ["http://localhost:5000/todos"]
        },
        "method": {
          "type": "string",
          "description": "HTTP method",
          "enum": ["POST"]
        },
        "body": {
          "properties": {
            "todo": {
              "description": "The todo to add",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "required": ["url", "method", "body"]
    }
  }
]

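Using this metadata and the base class, we create each class.
Python's built-in type() function is used to create classes that inherit from BaseClass, which are stored in a dictionary.
After that, they are arranged into a form that Function calling can use.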

import copy

# Dictionary that holds the generated classes, keyed by function name
function_classes = {}

# The server URL comes from the servers section of openapi.yaml
server_url = openapi_data["servers"][0]["url"]

# Dynamically generate classes from openapi.yaml
for path, methods in openapi_data.get("paths", {}).items():
    for method, method_info in methods.items():
        metadata = create_function_metadata_from_openapi(
            endpoint=server_url + path,
            method=method,
            method_info=method_info,
            schemas=openapi_data.get("components", {}).get("schemas", {}),
        )

        # Create a class for each function and store it in the dictionary
        function_classes[metadata["name"]] = type(
            metadata["name"],
            (BaseClass,),
            {
                # copy() is a shallow copy, so nested values would be shared and overwritten; use deepcopy()
                "metadata": copy.deepcopy(metadata),
            },
        )


# Make them usable with Function calling
functions_metadata = [function_class.metadata for function_class in function_classes.values()]
functions_callable = {function_class.metadata["name"]: function_class.run for function_class in function_classes.values()}
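
The deepcopy() comment in the code above is easier to see with a small experiment. The following is a minimal sketch with hypothetical names (`Base`, `make_class`, and the /other URL are made up for illustration): it creates classes with type() from a shared template, once with a shallow copy and once with a deep copy.

```python
import copy


class Base:
    pass


template = {"name": "", "parameters": {"properties": {"url": {"enum": []}}}}


def make_class(name: str, url: str, source: dict, deep: bool):
    # dict(...) copies only the top level; copy.deepcopy also copies nested dicts.
    metadata = copy.deepcopy(source) if deep else dict(source)
    metadata["name"] = name
    metadata["parameters"]["properties"]["url"]["enum"] = [url]
    # type(name, bases, namespace) creates a new class at runtime.
    return type(name, (Base,), {"metadata": metadata})


# Shallow copies: both classes share the same nested "parameters" dict.
shared = copy.deepcopy(template)
a = make_class("getTodos", "http://localhost:5000/todos", shared, deep=False)
b = make_class("postTodo", "http://localhost:5000/other", shared, deep=False)
shallow_a_enum = a.metadata["parameters"]["properties"]["url"]["enum"]

# Deep copies: each class gets its own nested dicts.
c = make_class("getTodos", "http://localhost:5000/todos", template, deep=True)
d = make_class("postTodo", "http://localhost:5000/other", template, deep=True)
deep_c_enum = c.metadata["parameters"]["properties"]["url"]["enum"]

print(shallow_a_enum)  # overwritten by the second class
print(deep_c_enum)     # keeps its own value
```

With the shallow copy, the top-level "name" stays separate but the URL of the first class is overwritten by the second, which is the kind of overwrite the deepcopy() call avoids.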

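c. Executing Function calling

This step is no different just because the plugin follows the ChatGPT spec; we simply execute according to the Function calling specification.

The key points are as follows.

Specify the set of functions to use (functions_metadata) and call the OpenAI API
If the model decides a function call is needed, get the information required to run it (func_name, func_args)
Run the corresponding function from functions_callable
Add the execution result to messages to keep track of progress, so that the same function is not executed repeatedly

Implementation example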

import json
import os

import openai  # assumes the pre-1.0 openai package, which provides openai.ChatCompletion

# Azure OpenAI settings
openai.api_type = "azure"
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai.api_version = os.getenv("AZURE_OPENAI_API_VERSION")


# System prompt
SYSTEM_PROMPT = """
You are an assistant that helps the user.
To answer the user's input correctly, you can think carefully, step by step.
First consider what is needed to achieve the goal, then explain your thinking and actions.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
]


def exec_function_calling(user_input: str):
    # Add the user's input to the messages
    messages.append({"role": "user", "content": user_input})

    # Run inference
    response = openai.ChatCompletion.create(
        engine=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
        messages=messages,
        functions=functions_metadata,
        function_call="auto",
        temperature=0,
    )

    # Loop while the model asks for a function call
    while response["choices"][0]["message"].get("function_call"):
        msg = response["choices"][0]["message"]

        # Get the function call information
        func_name = msg["function_call"]["name"]
        func_args = json.loads(msg["function_call"]["arguments"])

        # Call the function
        print(f"Function name: {func_name}")
        print(f"Arguments: {func_args}\n\n")
        func_result = functions_callable[func_name](**func_args)

        # Add the function's result to the messages
        status_msg = "Executed function: {}\nResult: {}".format(func_name, func_result)
        messages.append(
            {
                "role": "function",
                "name": func_name,
                "content": status_msg
            }
        )

        # Run inference again
        response = openai.ChatCompletion.create(
            engine=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
            messages=messages,
            functions=functions_metadata,
            function_call="auto"
        )

    # Add the final answer to the messages
    result = response["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": result})

    return result


# Run Function calling
user_input = 'If "Write an article on Qiita" is not in the Todo list, I want to add it'
result = exec_function_calling(user_input)
print(result)

In my environment, the result was as follows.

### Debug output ###
Function name: getTodos
Arguments: {'url': 'http://localhost:5000/todos', 'method': 'GET'}


Function name: postTodo
Arguments: {'url': 'http://localhost:5000/todos', 'method': 'POST', 'body': {'todo': 'Write an article on Qiita'}}


### Result ###
First, I run the getTodos function to get the Todo list.
The result was as follows.

\```
{
  "todos": [
    "todo1",
    "todo2",
    "todo3"
  ]
}
\```

Next, I run the postTodo function to add the new Todo "Write an article on Qiita" to the Todo list.
The Todo list after execution is as follows.

\```
{
  "todos": [
    "todo1",
    "todo2",
    "todo3",
    "Write an article on Qiita"
  ]
}
\```

With this, "Write an article on Qiita" has been correctly added to the Todo list.

Let's also look at the log of the server hosting the plugin. It looks fine.
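
The loop in the implementation example keeps running for as long as the model keeps asking for functions. A variant with an upper bound on the number of steps can be sketched and tried without the API. This is only a sketch under assumptions: `run_tool_loop`, `fake_complete`, and the scripted replies are hypothetical, and `complete` stands in for the call to the model. This sketch also records the assistant's function-call message in the history, which the example above does not do.

```python
import json


def run_tool_loop(complete, registry: dict, messages: list, max_steps: int = 5) -> str:
    """Call `complete` repeatedly, running requested functions, at most max_steps times."""
    for _ in range(max_steps):
        msg = complete(messages)
        call = msg.get("function_call")
        if not call:
            # Plain-text answer: record it and stop.
            messages.append({"role": "assistant", "content": msg["content"]})
            return msg["content"]
        # Record the request, run the function, then record its result.
        messages.append(msg)
        result = registry[call["name"]](**json.loads(call["arguments"]))
        messages.append(
            {"role": "function", "name": call["name"], "content": json.dumps(result)}
        )
    raise RuntimeError("max_steps reached without a final answer")


# In-memory stand-ins for the plugin's two features.
todos = ["todo1"]


def get_todos() -> dict:
    return {"todos": list(todos)}


def post_todo(body: dict) -> dict:
    todos.append(body["todo"])
    return {"todos": list(todos)}


registry = {"getTodos": get_todos, "postTodo": post_todo}

# Scripted replies that play the role of the model.
scripted = iter([
    {"role": "assistant", "content": None,
     "function_call": {"name": "getTodos", "arguments": "{}"}},
    {"role": "assistant", "content": None,
     "function_call": {"name": "postTodo",
                       "arguments": '{"body": {"todo": "write article"}}'}},
    {"role": "assistant", "content": "Added the todo."},
])


def fake_complete(messages: list) -> dict:
    return next(scripted)


history = [{"role": "user", "content": "Add 'write article' if it is missing"}]
answer = run_tool_loop(fake_complete, registry, history)
print(answer)
print(todos)
```

Passing the model call in as `complete` keeps the loop itself independent of the OpenAI client, so the same loop could be exercised with scripted replies as above.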

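Summary

I implemented a way to call a ChatGPT-spec plugin from Function calling.
If you want to use a different plugin, there is no need to change the processing implemented here; you only need to change the URL of the server that hosts the plugin.
This is the strength of a unified specification: it makes a variety of plugins easier to use.

The setup in this article uses the plugin without taking .well-known/ai-plugin.json into account.
Langchain's implementation covers that part as well, so I plan to keep experimenting while using it as a reference.

The source code is available below.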

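Bonus

For fun, I wrote a prompt that has ChatGPT generate the Function calling classes covered in this article, so I will share it too. The accuracy is not great.

The goal is: "when the user inputs an openapi.yaml, generate source code for the corresponding Function calling classes."

Input: openapi.yaml
Output: source code containing the classes for Function calling

Copy and paste the following prompt into ChatGPT and try it.

Prompt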

 - # Goal
    - Create Function calling metadata from the contents of openapi.yaml
 - # Output example
    - class AddTodo:
          metadata = {
              "name": "postTodo",
              "description": "Add a todo to the list",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "url": {
                          "type": "string",
                          "enum": ["http://localhost:5000/add-todo"],
                          "description": "Request URL"
                      },
                      "method": {
                          "type": "string",
                          "enum": ["POST"],
                          "description": "HTTP request method"
                      },
                      "body": {
                          "type": "object",
                          "properties": {
                              "todo": {
                                  "type": "string",
                                  "description": "The todo to add"
                              }
                          },
                          "required": ["todo"]
                      }
                  },
                  "required": ["url", "method", "body"]
              }
          }


          def run(self, url:str, method:str, body:dict) -> dict:
              if method == "POST":
                  # Send the request (body is sent as application/json)
                  response = requests.post(url, json=body)

                  # Get the response
                  response_body = response.json()
                  return response_body
              else:
                  raise Exception("Not supported method")
 - # Execution process
    - 1. Ask the user for the openapi.yaml definition
    - 2. Understand the openapi.yaml specification and confirm its contents with the user
    - 3. Create a deliverable for each path.
    - 4. Enter the feedback loop
 - # Feedback loop
    - Have the user check the deliverable
    - Revise the deliverable through dialogue with the user
 - # Deliverable
    - Function calling metadata like the one shown in the output example
    - The output is in JSON format and has the fields name, description, and parameters.
    - properties must always include url and method
 - Now, let's begin by following the execution process.
