@moritalous

It looks like you can build an agent that tells you anything about your own AWS environment (Agents for Amazon Bedrock + Knowledge bases for Amazon Bedrock)

Posted at 2024-06-16, last updated at 2024-06-16

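Motivation

LangChain has a tool called Python REPL
The documentation warns that it runs on the host machine, so be careful
So I figured all I need is a sandbox that doesn't run on the host machine
Then it hit me: isn't this a job for Lambda?
I tried it and it worked! Yay!
While I'm at it, let's run it from Agents for Amazon Bedrock
The agent has to come up with correct Python code, so I'm throwing in Knowledge bases as a bonus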

What is LangChain's Python REPL?

Python REPL is one of LangChain's "tools".

With a short snippet like the following, you can get the result of actually running code in Python.

from langchain_core.tools import Tool
from langchain_experimental.utilities import PythonREPL

python_repl = PythonREPL()

python_repl.run("print(1+1)")

In this example, you get the output of print(1+1).

'2\n'
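
Only text written to standard output comes back, which is why the snippet calls print. The following is a simplified sketch of that idea, not LangChain's actual code: it executes a string with exec while standard output is redirected into a buffer. The name run_and_capture is made up for illustration.

```python
import contextlib
import io


def run_and_capture(source: str) -> str:
    """Execute `source` and return whatever it printed to stdout."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})
    except Exception as e:
        # Report the error as text instead of raising
        return repr(e)
    return buffer.getvalue()


print(repr(run_and_capture("print(1+1)")))  # '2\n'
print(repr(run_and_capture("1+1")))         # '' (nothing was printed)
```

A bare expression such as 1+1 is evaluated but produces no output, so a script has to print whatever it wants to return.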

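However, the Python script runs in the environment of the host machine where LangChain is running, so there is a risk of, for example, accidentally deleting important files.

I decided to run this on Lambda instead.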

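Porting Python REPL to Lambda

The Python REPL implementation is very simple: only about 90 lines.

Reference: source code on GitHub

I paid attention to two points when porting it to Lambda.

Remove the LangChain dependencies that aren't needed to run it
multiprocessing.Queue doesn't work on Lambda, so replace it with multiprocessing.Pipe (reference)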
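
The second point can be illustrated in isolation. The sketch below is not the article's code: it passes a value from a child process back to the parent over multiprocessing.Pipe. The names _child and add_in_subprocess are hypothetical, and it assumes a Linux runtime where the fork start method is available.

```python
import multiprocessing as mp

# Assumption: Linux, where the "fork" start method exists
_ctx = mp.get_context("fork")


def _child(conn, a, b):
    """Runs in the child process and sends one result through the pipe."""
    try:
        conn.send(a + b)
    finally:
        conn.close()


def add_in_subprocess(a, b, timeout=5.0):
    """Compute a + b in a child process; return None if it takes too long."""
    parent_conn, child_conn = _ctx.Pipe()
    p = _ctx.Process(target=_child, args=(child_conn, a, b))
    p.start()
    child_conn.close()  # the parent does not need its copy of the child end
    try:
        # Wait for data on the pipe, then read it
        if parent_conn.poll(timeout):
            return parent_conn.recv()
        p.terminate()
        return None
    finally:
        p.join()
        parent_conn.close()


if __name__ == "__main__":
    print(add_in_subprocess(1, 2))  # 3
```

This sketch waits on the pipe with poll before joining the process, so a child that sends a large payload is not left blocked waiting for the parent to read.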

The result is the three functions below. I even dropped the class.

import logging
from multiprocessing import Process, Pipe
import re
import sys
from io import StringIO
from typing import Dict, Optional

logger = logging.getLogger(__name__)

globals: Optional[Dict] = {}
locals: Optional[Dict] = {}


def sanitize_input(query: str) -> str:
    """Sanitize input to the python REPL.

    Remove whitespace, backtick & python
    (if llm mistakes python console as terminal)

    Args:
        query: The query to sanitize

    Returns:
        str: The sanitized query
    """
    query = re.sub(r"^(\s|`)*(?i:python)?\s*", "", query)
    query = re.sub(r"(\s|`)*$", "", query)
    return query


def worker(
    command: str,
    globals: Optional[Dict],
    locals: Optional[Dict],
    conn,
) -> None:
    old_stdout = sys.stdout
    sys.stdout = mystdout = StringIO()
    try:
        cleaned_command = sanitize_input(command)
        exec(cleaned_command, globals, locals)
        sys.stdout = old_stdout
        conn.send(mystdout.getvalue())
    except Exception as e:
        sys.stdout = old_stdout
        conn.send(repr(e))
    conn.close()


def run(command: str, timeout: Optional[int] = None) -> str:
    """Run command and returns anything printed.
    Timeout after the specified number of seconds."""

    parent_conn, child_conn = Pipe()

    # Only use multiprocessing if we are enforcing a timeout
    if timeout is not None:
        # create a Process
        p = Process(
            target=worker, args=(
                command, globals, locals, child_conn)
        )

        # start it
        p.start()

        # wait for the process to finish or kill it after timeout seconds
        p.join(timeout)

        if p.is_alive():
            p.terminate()
            return "Execution timed out"
    else:
        worker(command, globals, locals, child_conn)
    # get the result from the worker function
    return parent_conn.recv()

Better yet, it has no external library dependencies, so no Lambda layer is needed.
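
To see what sanitize_input strips before the script reaches exec, the function can be tried on its own. This is a small sketch that copies the two substitutions from the code above; the input strings are made-up examples.

```python
import re


def sanitize_input(query: str) -> str:
    # Same two substitutions as in the code above
    query = re.sub(r"^(\s|`)*(?i:python)?\s*", "", query)
    query = re.sub(r"(\s|`)*$", "", query)
    return query


# A script wrapped in a Markdown code fence, as an LLM might produce it
fenced = "```python\nprint('hello')\n```"
print(sanitize_input(fenced))  # print('hello')

# Leading and trailing whitespace is dropped as well
print(sanitize_input("  print(1)  "))  # print(1)

# Caution: a leading "python" is removed even when it is part of an identifier
print(sanitize_input("python_version = 1"))  # _version = 1
```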

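Creating the agent with Agents for Amazon Bedrock

Wondering what I could do with the REPL, I came up with "an agent you can ask about your AWS environment".
By granting the Lambda function ReadOnlyAccess permissions, I made it able to see everything (but change nothing).

I'll skip the detailed steps, but I configured it as follows.

Overall agent settings

I wrote the instructions for the agent as follows.

Instructions for the agent

You are an agent that performs maintenance of AWS.
You take appropriate actions in response to the user's questions and generate answers.

You can retrieve information about AWS resources, but you cannot operate on them.
For example, you can list the objects stored in S3, but you cannot create or delete objects.

You can refer to the EC2 and IAM manuals as a knowledge base, so it is a good idea to read the knowledge base first and work out the command to run.

You also have a Python REPL environment, so you can run the command you came up with.
Note that languages other than Python cannot be run.
Users will appreciate it if you answer their questions by showing the result of running Python code.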

Action group

For the action group type I selected "Define with function details" and defined just one parameter.

Setting	Value
Name	command
Description	A Python script to run in a REPL environment.
Type	string
Required	True

The command parameter is expected to receive a Python script as a string.
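
On the Lambda side, the parameter arrives as one entry in the event's parameters list, each entry carrying a name and a value. A minimal sketch of reading it follows; get_parameter is a hypothetical helper, and the sample event is a trimmed, made-up example rather than a captured one.

```python
from typing import Any, Dict, Optional


def get_parameter(event: Dict[str, Any], name: str) -> Optional[str]:
    """Return the value of the named action-group parameter, or None if absent."""
    for parameter in event.get("parameters", []):
        if parameter.get("name") == name:
            return parameter.get("value")
    return None


# Hypothetical event, trimmed to the fields used here
sample_event = {
    "messageVersion": "1.0",
    "actionGroup": "repl",
    "function": "run",
    "parameters": [
        {"name": "command", "type": "string", "value": "print(1+1)"},
    ],
}

print(get_parameter(sample_event, "command"))  # print(1+1)
```

Returning None for a missing parameter lets the handler reply with an explanatory message instead of failing with an exception.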

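I quick-created the Lambda function and then applied the code above. (I also extended the timeout and granted the IAM permissions.)

The source code is messy enough that I shouldn't really show it to anyone, but I'll paste it here anyway.

Full Lambda source code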

dummy_lambda.py

import logging
from multiprocessing import Process, Pipe
import re
import sys
from io import StringIO
from typing import Dict, Optional

logger = logging.getLogger(__name__)

globals: Optional[Dict] = {}
locals: Optional[Dict] = {}


def sanitize_input(query: str) -> str:
    """Sanitize input to the python REPL.

    Remove whitespace, backtick & python
    (if llm mistakes python console as terminal)

    Args:
        query: The query to sanitize

    Returns:
        str: The sanitized query
    """
    query = re.sub(r"^(\s|`)*(?i:python)?\s*", "", query)
    query = re.sub(r"(\s|`)*$", "", query)
    return query


def worker(
    command: str,
    globals: Optional[Dict],
    locals: Optional[Dict],
    conn,
) -> None:
    old_stdout = sys.stdout
    sys.stdout = mystdout = StringIO()
    try:
        cleaned_command = sanitize_input(command)
        exec(cleaned_command, globals, locals)
        sys.stdout = old_stdout
        conn.send(mystdout.getvalue())
    except Exception as e:
        sys.stdout = old_stdout
        conn.send(repr(e))
    conn.close()


def run(command: str, timeout: Optional[int] = None) -> str:
    """Run command and returns anything printed.
    Timeout after the specified number of seconds."""

    parent_conn, child_conn = Pipe()

    # Only use multiprocessing if we are enforcing a timeout
    if timeout is not None:
        # create a Process
        p = Process(
            target=worker, args=(
                command, globals, locals, child_conn)
        )

        # start it
        p.start()

        # wait for the process to finish or kill it after timeout seconds
        p.join(timeout)

        if p.is_alive():
            p.terminate()
            return "Execution timed out"
    else:
        worker(command, globals, locals, child_conn)
    # get the result from the worker function
    return parent_conn.recv()

#####

import json

def lambda_handler(event, context):
    agent = event['agent']
    actionGroup = event['actionGroup']
    function = event['function']
    parameters = event.get('parameters', [])
    
    command = list(filter(lambda x: x["name"] == "command", parameters))[0]
    

    # Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
    responseBody =  {
        "TEXT": {
            "body": run(command=command["value"])
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'function': function,
        'functionResponse': {
            'responseBody': responseBody
        }

    }

    dummy_function_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print("Response: {}".format(dummy_function_response))

    return dummy_function_response

That completes the action group.
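
For a quick check outside the console, the response envelope that the handler returns can be built and inspected locally. This sketch mirrors the shape used in the handler above; build_function_response and the sample values are hypothetical.

```python
from typing import Any, Dict


def build_function_response(event: Dict[str, Any], text: str) -> Dict[str, Any]:
    """Wrap `text` in the response shape used by the handler above."""
    return {
        "messageVersion": event["messageVersion"],
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {
                "responseBody": {"TEXT": {"body": text}},
            },
        },
    }


# Hypothetical event, trimmed to the fields the envelope echoes back
sample_event = {"messageVersion": "1.0", "actionGroup": "repl", "function": "run"}

response = build_function_response(sample_event, "2\n")
print(response["response"]["functionResponse"]["responseBody"]["TEXT"]["body"])
```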

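Knowledge base

I created the knowledge base with the default settings of Knowledge bases for Amazon Bedrock and registered the "IAM User Guide" and "IAM API Reference" from the IAM documentation.

Here are the instructions for the knowledge base.

Knowledge base instructions

This knowledge base is the manual for the AWS IAM service. It contains the User Guide and the API Reference. The User Guide also covers languages other than Python, the CLI, and the management console. If you want to know how to do something in Python, add Python to your search keywords.

That's it, the agent is done.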

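Let's try it

Question: Tell me all the IAM user names

I've blacked them out, but the IAM user names were retrieved correctly.

Let's follow the trace.

First, the agent searches the knowledge base
The search results are incorporated into the prompt
It generates Python code from the search results and calls the REPL
The REPL returns the IAM user names, so the final response is generated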
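
The code the agent actually generated is not reproduced in the text. As a rough idea of the kind of script that could be passed as command for this question, here is a sketch assuming boto3 is available in the Lambda runtime; list_iam_user_names and main are hypothetical names.

```python
from typing import Any, List


def list_iam_user_names(iam_client: Any) -> List[str]:
    """Collect every IAM user name, following pagination."""
    names: List[str] = []
    paginator = iam_client.get_paginator("list_users")
    for page in paginator.paginate():
        names.extend(user["UserName"] for user in page["Users"])
    return names


def main() -> None:
    # Imported here so the helper above can be exercised without boto3
    import boto3

    # The REPL only returns what is printed, so print each name
    for name in list_iam_user_names(boto3.client("iam")):
        print(name)


# main()  # uncomment where AWS credentials with read access are available
```

Because list_users is a read-only call, it fits within the ReadOnlyAccess permissions given to the Lambda function.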

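It answered another question as well. (These names aren't too embarrassing, so no blackout this time.)

This case took six steps and went as follows.

Searches the knowledge base for "how to get a list of IAM roles"
Generates the knowledge base result
Apparently not satisfied with the search results, it searches the knowledge base again for "Python get a list of IAM roles"
Generates the search result
Having obtained the search results it wanted, it calls the REPL
Generates the final response

It really feels like an agent!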

Finally, a plug

I'm publishing a book on Bedrock. It goes on sale in 10 days!
If you're interested, please pick up a copy.

Amazon Bedrock 生成AIアプリ開発入門 [AWS深掘りガイド]
