Building Knowledge Bases for Amazon Bedrock with AWS SAM

Posted on 2024-05-01

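Introduction

In April 2024, Knowledge Bases for Amazon Bedrock and Agents for Amazon Bedrock gained support for deployment with AWS CloudFormation. The AWS::Bedrock resources are documented in the user guide.

Based on that documentation, I tried building a knowledge base with AWS SAM.
As of April 30, 2024, CloudFormation does not yet appear to support two features released on April 23, 2024: configuring multiple data sources, and choosing whether to retain the data in the vector store when a data source is deleted.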

Architecture diagram

Of the resources shown in the architecture diagram, the AWS SAM template.yaml written in this article creates the following:

  • Amazon API Gateway
  • AWS Lambda
  • Knowledge Bases for Amazon Bedrock
  • DynamoDB

The AWS Secrets Manager secret and the Amazon S3 bucket already exist, and the template only references them.

(Figure: architecture diagram)

References

Environment setup

Prerequisites

  • An AWS user with permission to create, delete, and modify the following resources:
    • AWS IAM
    • AWS Lambda
    • AWS CloudFormation
    • AWS Secrets Manager
    • Amazon API Gateway
    • Amazon S3
    • Amazon CloudWatch Logs
    • Amazon Bedrock
      • Access to Anthropic Claude 3 Sonnet is enabled
    • Amazon DynamoDB
  • The AWS Region used is us-east-1
  • You have an account and the permissions needed to create a Slack App
  • You have an account that can sign in to Pinecone (Google, GitHub, or Microsoft)

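Pinecone setup and index creation

Set up Pinecone by following the "Pinecone setup and index creation" steps in the previous article.

Uploading files to the S3 bucket

Upload the files to the S3 bucket by following the "Uploading files to the S3 bucket" steps in the previous article.

Development environment setup

OS version of the working environment

I run Ubuntu 23.04 on WSL on Windows 11.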

$ cat /etc/os-release | grep PRETTY_NAME
PRETTY_NAME="Ubuntu 23.04"

Python environment

$ python3 --version
Python 3.12.0
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install --upgrade pip
$ pip3 --version
pip 24.0 from /home/xxx/.venv/lib/python3.12/site-packages/pip (python 3.12)

AWS environment setup

Set the default Region and credentials with the aws configure command, or prepare ~/.aws/config and ~/.aws/credentials by hand.
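As a quick sanity check of that configuration, the following is a minimal sketch that reads the Region of a profile from text in the ~/.aws/config format, using only the standard library. The helper name read_profile_region and the sample contents are assumptions made for illustration; they are not part of the AWS CLI.

```python
import configparser
from typing import Optional

# Illustrative contents of ~/.aws/config (sample values, not a real setup).
SAMPLE_AWS_CONFIG = """\
[default]
region = us-east-1
output = json
"""


def read_profile_region(config_text: str, profile: str = "default") -> Optional[str]:
    """Return the region set for a profile section, or None if it is missing.

    Note: in ~/.aws/config, non-default profiles use section names such as
    "profile dev"; this sketch expects the caller to pass the section name.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(profile):
        return None
    return parser.get(profile, "region", fallback=None)


region = read_profile_region(SAMPLE_AWS_CONFIG)
if region != "us-east-1":
    print(f"Unexpected region: {region}")
else:
    print("Default region is us-east-1, as this article assumes.")
```

To check a real environment, the text could be read from the file at os.path.expanduser("~/.aws/config") and passed to the same function.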

Installing the AWS SAM CLI

This article uses AWS SAM to build and run a serverless application on AWS.

Install the AWS SAM CLI by following "Installing the AWS SAM CLI". Because I use Linux on x86_64, I follow the "x86_64 - command line installer" steps.

$ sam --version
SAM CLI, version 1.115.0

Building the application

The directory structure is as follows.

.
├── bedrock-slack-backlog-rag-app
│   ├── __init__.py
│   ├── app.py
│   └── requirements.txt
├── samconfig.toml
└── template.yaml

__init__.py is an empty file.
bedrock-slack-backlog-rag-app/requirements.txt is shown below. boto3 and requests are also required, but template.yaml is written so that they are added through Lambda layers.

slack-bolt
slack-sdk
langchain

Structure of template.yaml

template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Slack Bedrock Assistant.

Resources:
  # Lambda function for Bedrock
  BedrockAssistantFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: bedrock-slack-backlog-rag-app/
      Handler: app.lambda_handler
      Runtime: python3.12
      Role: !GetAtt LambdaRole.Arn
      Timeout: 300
      MemorySize: 512
      Architectures:
        - arm64
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref DynamoDBTable
      Environment:
        Variables:
          SECRET_NAME: 'Bedrock-sam-secrets-backlog-rag' # Name of the secret in Secrets Manager
          REGION_NAME: 'us-east-1' # Region of the secret in Secrets Manager
          DYNAMODB_TABLE_NAME: !Ref DynamoDBTable
          KNOWLEDGE_BASE_ID: !Ref KnowledgeBaseWithPinecone

      Events:
        Slack:
          Type: Api
          Properties:
            Method: POST
            Path: /slack/events
      Layers:
        # Layer for AWS Parameter Store and Secrets Manager
        # https://docs.aws.amazon.com/systems-manager/latest/userguide/ps-integration-lambda-extensions.html#ps-integration-lambda-extensions-add
        - arn:aws:lambda:us-east-1:177933569100:layer:AWS-Parameters-and-Secrets-Lambda-Extension-Arm64:11
        # Layer for boto3
        # https://github.com/keithrozario/Klayers?tab=readme-ov-file#list-of-arns
        - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-p312-arm64-boto3:1

  # DynamoDB Table for storing chat history
  DynamoDBTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: 'bedrock-slack-backlog-rag-app-chat-history'
      AttributeDefinitions:
        - AttributeName: 'SessionId'
          AttributeType: 'S'
      KeySchema:
        - AttributeName: 'SessionId'
          KeyType: 'HASH'
      BillingMode: PAY_PER_REQUEST

  # IAM Role for lambda.
  LambdaRole:
    Type: "AWS::IAM::Role"
    Properties:
      RoleName: bedrock-slack-backlog-rag-app-lambda-role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: allow-lambda-invocation
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                  - lambda:InvokeAsync
                Resource: "*"
        - PolicyName: SecretsManagerPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: 'secretsmanager:GetSecretValue' # Required for Lambda to retrieve the secret
                Resource: "*"
        - PolicyName: allow-bedrock-agent-access
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - bedrock:InvokeAgent
                  - bedrock:InvokeModel
                  - bedrock:Retrieve
                  - bedrock:InvokeModelWithResponseStream
                Resource: "*"
        - PolicyName: DynamoDBCrudPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - dynamodb:PutItem
                  - dynamodb:GetItem
                  - dynamodb:UpdateItem
                  - dynamodb:DeleteItem
                Resource: "*"
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

  BacklogAssitantLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /aws/lambda/${BedrockAssistantFunction}
      RetentionInDays: 14 # Optional. If omitted, log events never expire.

  BedrockAccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: "AmazonBedrockFoundationModelPolicyForKnowledgeBase_ipa_documents"
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: BedrockListModelsStatement
            Effect: Allow
            Action:
              - 'bedrock:ListFoundationModels'
              - 'bedrock:ListCustomModels'
            Resource: '*'
          - Sid: BedrockInvokeModelStatement
            Effect: Allow
            Action:
              - 'bedrock:InvokeModel'
            Resource: 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1'
      Roles:
        - !Ref BedrockKnowledgeBaseRole

  S3AccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: "AmazonBedrockS3PolicyForKnowledgeBase_ipa_documents"
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: S3ObjectStatement
            Effect: Allow
            Action:
              - 's3:GetObject'
              - 's3:ListBucket'
            Resource:
              - 'arn:aws:s3:::foo-bar-bucket/*'  # ARN of the S3 bucket that stores the documents, with a trailing /* (objects).
              - 'arn:aws:s3:::foo-bar-bucket'  # ARN of the bucket itself, needed for s3:ListBucket.
            Condition:
              StringEquals:
                aws:PrincipalAccount: !Ref AWS::AccountId
      Roles:
        - !Ref BedrockKnowledgeBaseRole

  BedrockKnowledgeBaseRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: "AmazonBedrockExecutionRoleForKnowledgeBase_ipa_documents"
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: AmazonBedrockKnowledgeBaseTrustPolicy
            Effect: Allow
            Principal:
              Service: [bedrock.amazonaws.com]
            Action: ['sts:AssumeRole']
            Condition:
              StringEquals:
                aws:ResourceAccount: !Ref AWS::AccountId
              ArnLike:
                aws:SourceArn: !Sub "arn:${AWS::Partition}:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/*"
      Policies:
        - PolicyName: AmazonBedrockSecretsPolicyForKnowledgeBase_ipa_documents
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Sid: SecretsAccessPolicy
                Effect: Allow
                Action: ['secretsmanager:GetSecretValue']
                Resource: !Sub 'arn:aws:secretsmanager:us-east-1:${AWS::AccountId}:secret:bedrock-pinecone-serverless-apikey-20240117-xxxxx' # ARN of the Secrets Manager secret
                Condition:
                  StringEquals:
                    aws:ResourceAccount: !Ref AWS::AccountId

  KnowledgeBaseWithPinecone:
    Type: AWS::Bedrock::KnowledgeBase
    Properties:
      Name: "knowledge-base-ipa-documents-v20240501-01"
      Description: "独立行政法人 情報処理推進機構 IPAによる\"安全なウェブサイトの作り方\"に関するドキュメント。"
      KnowledgeBaseConfiguration:
          Type: VECTOR
          VectorKnowledgeBaseConfiguration:
              EmbeddingModelArn: 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1'
      RoleArn: !GetAtt BedrockKnowledgeBaseRole.Arn
      StorageConfiguration:
        Type: PINECONE
        PineconeConfiguration:
            ConnectionString: 'https://bedrock-pinecone-serverless-ipa-documents-z2bueuz.svc.aped-4627-b74a.pinecone.io'
            CredentialsSecretArn: !Sub 'arn:aws:secretsmanager:us-east-1:${AWS::AccountId}:secret:bedrock-pinecone-serverless-apikey-20240117-xxxxxx' # ARN of the Secrets Manager secret
            FieldMapping:
              MetadataField: "metadata"
              TextField: "text"

  DataSource:
    Type: AWS::Bedrock::DataSource
    Properties:
      KnowledgeBaseId: !Ref KnowledgeBaseWithPinecone
      Name: "knowledge-base-data-source-ipa-documents"
      Description: "Data Source"
      DataSourceConfiguration:
        Type: S3
        S3Configuration:
          BucketArn: 'arn:aws:s3:::foo-bar-bucket'  # ARN of the S3 bucket that stores the documents; must match the bucket in S3AccessPolicy
      VectorIngestionConfiguration:
        ChunkingConfiguration:
          ChunkingStrategy: FIXED_SIZE
          FixedSizeChunkingConfiguration:
            MaxTokens:  512
            OverlapPercentage: 25

Outputs:
  BedrockAssistantApi:
    Description: "The URL of Slack Event Subscriptions"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/slack/events"
  BedrockAssistantFunction:
    Description: "Bedrock Assistant Lambda Function ARN"
    Value: !GetAtt BedrockAssistantFunction.Arn
  BedrockAssistantFunctionIamRole:
    Description: "Implicit IAM Role created for Bedrock Assistant function"
    Value: !GetAtt LambdaRole.Arn
  BedrockKnowledgeBaseId:
    Value: !Ref KnowledgeBaseWithPinecone
  BedrockDataSourceId:
    Value: !Ref DataSource

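The AWS SAM template file (template.yaml) defines the AWS resources to create.
It describes the role and policies for the Lambda function, the Lambda environment variables, and so on. It also covers the following layers, policies, and resources:

  • The AWS-Parameters-and-Secrets-Lambda-Extension layer, which lets the Lambda function access Secrets Manager
  • A layer that packages boto3 so that it can be imported inside the Lambda function
  • A resource-based policy that allows Bedrock to invoke a Lambda function
  • A policy that allows DynamoDB operations
  • A DynamoDB table with SessionId as its primary key
  • Knowledge Bases for Amazon Bedrock and its data source

The IAM policies that Knowledge Bases for Amazon Bedrock requires are defined as customer managed policies, as follows.

IAM policies required by Bedrock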
  BedrockAccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: "AmazonBedrockFoundationModelPolicyForKnowledgeBase_ipa_documents"
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: BedrockListModelsStatement
            Effect: Allow
            Action:
              - 'bedrock:ListFoundationModels'
              - 'bedrock:ListCustomModels'
            Resource: '*'
          - Sid: BedrockInvokeModelStatement
            Effect: Allow
            Action:
              - 'bedrock:InvokeModel'
            Resource: 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1'
      Roles:
        - !Ref BedrockKnowledgeBaseRole

  S3AccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: "AmazonBedrockS3PolicyForKnowledgeBase_ipa_documents"
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: S3ObjectStatement
            Effect: Allow
            Action:
              - 's3:GetObject'
              - 's3:ListBucket'
            Resource:
              - 'arn:aws:s3:::foo-bar-bucket/*'  # ARN of the S3 bucket that stores the documents, with a trailing /* (objects).
              - 'arn:aws:s3:::foo-bar-bucket'  # ARN of the bucket itself, needed for s3:ListBucket.
            Condition:
              StringEquals:
                aws:PrincipalAccount: !Ref AWS::AccountId
      Roles:
        - !Ref BedrockKnowledgeBaseRole

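The customer managed policies defined above are attached to the IAM role. When I defined AmazonBedrockSecretsPolicyForKnowledgeBase_ipa_documents as a customer managed policy in the same way, the error below occurred, so it is attached to the role as an inline policy instead. A possible cause (unverified) is that the knowledge base depends only on the role, so CloudFormation may create it before a separately defined managed policy has been attached. There may be a better way to define this.

Error that occurred
Resource handler returned message: "The knowledge base storage configuration provided is invalid... User: arn:aws:sts::xxxxxxxxxxxx:assumed-role/AmazonBedrockExecutionRoleForKnowledgeBase_ipa_documents/BedrockKnowledgeBaseCPSession-X7DT09FS4Y is not authorized to perform: secretsmanager:GetSecretValue on resource: {ARN of the Secrets Manager secret} because no identity-based policy allows the secretsmanager:GetSecretValue action"

IAM role required by Bedrock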
  BedrockKnowledgeBaseRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: "AmazonBedrockExecutionRoleForKnowledgeBase_ipa_documents"
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: AmazonBedrockKnowledgeBaseTrustPolicy
            Effect: Allow
            Principal:
              Service: [bedrock.amazonaws.com]
            Action: ['sts:AssumeRole']
            Condition:
              StringEquals:
                aws:ResourceAccount: !Ref AWS::AccountId
              ArnLike:
                aws:SourceArn: !Sub "arn:${AWS::Partition}:bedrock:${AWS::Region}:${AWS::AccountId}:knowledge-base/*"
      Policies:
        - PolicyName: AmazonBedrockSecretsPolicyForKnowledgeBase_ipa_documents
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Sid: SecretsAccessPolicy
                Effect: Allow
                Action: ['secretsmanager:GetSecretValue']
                Resource: !Sub 'arn:aws:secretsmanager:us-east-1:${AWS::AccountId}:secret:bedrock-pinecone-serverless-apikey-20240117-xxxxx' # ARN of the Secrets Manager secret

The knowledge base is defined as follows. Set EmbeddingModelArn to the ARN of the embedding model, and put the Pinecone connection details under PineconeConfiguration.

Knowledge base definition
  KnowledgeBaseWithPinecone:
    Type: AWS::Bedrock::KnowledgeBase
    Properties:
      Name: "knowledge-base-ipa-documents-v20240429-01"
      Description: "独立行政法人 情報処理推進機構 IPAによる\"安全なウェブサイトの作り方\"に関するドキュメント。"
      KnowledgeBaseConfiguration:
          Type: VECTOR
          VectorKnowledgeBaseConfiguration:
              EmbeddingModelArn: 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1'
      RoleArn: !GetAtt BedrockKnowledgeBaseRole.Arn
      StorageConfiguration:
        Type: PINECONE
        PineconeConfiguration:
            ConnectionString: 'Host address of the Pinecone index'
            CredentialsSecretArn: 'ARN of the Secrets Manager secret that stores the Pinecone API key'
            FieldMapping:
              MetadataField: "metadata"
              TextField: "text"

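The data source is defined as follows. Set BucketArn under S3Configuration to the ARN of the S3 bucket. Chunking is configured under ChunkingConfiguration. These settings can be set only when the data source is created and cannot be changed afterwards.

Data source definition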
  DataSource:
    Type: AWS::Bedrock::DataSource
    Properties:
      KnowledgeBaseId: !Ref KnowledgeBaseWithPinecone
      Name: "knowledge-base-data-source-ipa-documents"
      Description: "Data Source"
      DataSourceConfiguration:
        Type: S3
        S3Configuration:
          BucketArn: 'arn:aws:s3:::foo-bar-bucket'  # ARN of the S3 bucket that stores the documents
      VectorIngestionConfiguration:
        ChunkingConfiguration:
          ChunkingStrategy: FIXED_SIZE
          FixedSizeChunkingConfiguration:
            MaxTokens:  512
            OverlapPercentage: 25
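To make the effect of MaxTokens and OverlapPercentage more concrete, here is a minimal sketch of fixed-size chunking with overlap over a list of tokens. It only illustrates the general idea; how Bedrock actually tokenizes documents and computes the overlap is not described in this article, so the function fixed_size_chunks and its rounding rule are assumptions.

```python
def fixed_size_chunks(tokens, max_tokens=512, overlap_percentage=25):
    """Split a token list into chunks of at most max_tokens tokens,
    where each chunk repeats the tail of the previous chunk."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in the range [0, 100)")

    # Assumption: the overlap is rounded down to a whole number of tokens.
    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # how far each new chunk advances

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


# Integers stand in for tokens in this example.
tokens = list(range(1000))
chunks = fixed_size_chunks(tokens, max_tokens=512, overlap_percentage=25)
for index, chunk in enumerate(chunks):
    print(index, chunk[0], chunk[-1], len(chunk))
```

With these settings each chunk advances by 384 tokens and shares 128 tokens with its predecessor, which is why a larger OverlapPercentage produces more chunks for the same document.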

The Lambda layers are written in template.yaml as follows.

      Layers:
        # Layer for AWS Parameter Store and Secrets Manager
        # https://docs.aws.amazon.com/systems-manager/latest/userguide/ps-integration-lambda-extensions.html#ps-integration-lambda-extensions-add
        - arn:aws:lambda:us-east-1:177933569100:layer:AWS-Parameters-and-Secrets-Lambda-Extension-Arm64:11
        # Layer for boto3
        # https://github.com/keithrozario/Klayers?tab=readme-ov-file#list-of-arns
        - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-p312-arm64-boto3:1

After deployment, the layers appear under Layers of the Lambda function as shown below.

(Screenshot: Lambda layers)
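Layer version ARNs are easy to get wrong by hand (Region, layer name, version number), so a small check can help before deploying. The following sketch splits a layer version ARN into its parts; the helper parse_layer_version_arn and the LayerVersionArn type are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class LayerVersionArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    layer_name: str
    version: int


def parse_layer_version_arn(arn: str) -> LayerVersionArn:
    """Parse arn:<partition>:lambda:<region>:<account>:layer:<name>:<version>."""
    parts = arn.split(":")
    if (
        len(parts) != 8
        or parts[0] != "arn"
        or parts[2] != "lambda"
        or parts[5] != "layer"
    ):
        raise ValueError(f"Not a Lambda layer version ARN: {arn}")
    return LayerVersionArn(
        partition=parts[1],
        region=parts[3],
        account_id=parts[4],
        layer_name=parts[6],
        version=int(parts[7]),
    )


extension = parse_layer_version_arn(
    "arn:aws:lambda:us-east-1:177933569100:layer:"
    "AWS-Parameters-and-Secrets-Lambda-Extension-Arm64:11"
)
# A layer must be in the same Region as the function that uses it.
if extension.region != "us-east-1":
    print("Layer Region does not match the function Region")
print(extension.layer_name, extension.version)
```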

The resource-based policy is written in template.yaml as follows.

  BacklogSearchFunction:
    Type: AWS::Serverless::Function
    Properties:

(omitted)

  # Resource-based policy for Lambda.
  PermissionForBacklogSearchToInvokeLambda:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt BacklogSearchFunction.Arn
      Action: lambda:InvokeFunction
      Principal: bedrock.amazonaws.com

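After deployment, the policy appears under Resource-based policy statements in the Lambda function's configuration as shown below.

(Screenshot: resource-based policy)

Once the knowledge base has been created, its Knowledge base ID is displayed as shown below.
(Screenshot: Knowledge base ID)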

Set SECRET_NAME and REGION_NAME under Environment in template.yaml to the name and Region of the Secrets Manager secret created earlier.
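The secret is then read by key inside the Lambda function. As a rough sketch of that lookup, assuming the secret is stored as a JSON object of key/value pairs containing the SlackSigningSecret and SlackBotToken keys that app.py reads, the SecretString could be parsed like this. The helper name get_secret_field and the dummy values are made up for the example.

```python
import json


def get_secret_field(secret_string: str, key: str) -> str:
    """Return one value from a Secrets Manager SecretString stored as JSON."""
    secret = json.loads(secret_string)
    if key not in secret:
        raise KeyError(f"Key {key!r} not found in the secret")
    return secret[key]


# Dummy SecretString, shaped like the value that get_secret_value returns
# in response["SecretString"] for a key/value secret.
sample_secret_string = json.dumps(
    {
        "SlackSigningSecret": "dummy-signing-secret",
        "SlackBotToken": "xoxb-dummy-token",
    }
)

print(get_secret_field(sample_secret_string, "SlackBotToken"))
```

json.loads also handles JSON literals such as true, false, and null, which a Python-literal parser would reject.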

Structure of samconfig.toml

samconfig.toml
# More information about the configuration file can be found here:
# https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
version = 0.1

[default]
[default.global.parameters]
stack_name = "bedrock-slack-backlog-rag-app"

[default.build.parameters]
cached = true
parallel = true

[default.validate.parameters]
lint = true

[default.deploy.parameters]
capabilities = "CAPABILITY_NAMED_IAM"
confirm_changeset = true
resolve_s3 = true
region = "us-east-1"

[default.package.parameters]
resolve_s3 = true

[default.sync.parameters]
watch = true

[default.local_start_api.parameters]
warm_containers = "EAGER"

[default.local_start_lambda.parameters]
warm_containers = "EAGER"

The SAM CLI configuration file (samconfig.toml) defines the settings used when running the SAM CLI. It is based on the samconfig.toml generated by the AWS SAM "Tutorial: Deploy a Hello World application". For this example, I changed the following:

  • Changed stack_name in the [default.global.parameters] section from "sam-app" to "bedrock-slack-backlog-rag-app"
  • Added a region setting to the [default.deploy.parameters] section
  • Changed capabilities in the [default.deploy.parameters] section from "CAPABILITY_IAM" to "CAPABILITY_NAMED_IAM"

Structure of bedrock-slack-backlog-rag-app/app.py

bedrock-slack-backlog-rag-app/app.py
import ast
import logging
import os
import re
import time
from typing import Any

import boto3
from botocore.exceptions import ClientError
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import (
    PromptTemplate,
)
from langchain.retrievers import AmazonKnowledgeBasesRetriever
from langchain.schema import LLMResult
from langchain_community.chat_message_histories import DynamoDBChatMessageHistory
from langchain_community.chat_models import BedrockChat
from slack_bolt import App
from slack_bolt.adapter.aws_lambda import SlackRequestHandler

CHAT_UPDATE_INTERVAL_SEC = 1

SlackRequestHandler.clear_all_log_handlers()
logging.basicConfig(
    format="%(asctime)s [%(levelname)s] %(message)s",
    level=logging.DEBUG
)

logger = logging.getLogger(__name__)

REGION_NAME = "us-east-1"
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


class SecretsManager:
    """
    Class to retrieve secrets from Secrets Manager

    Attributes:
        secret_name (str): The name of the secret
        region_name (str): The name of the region
        client (boto3.client): The client for Secrets Manager
    """

    def __init__(self, secret_name, region_name):
        self.secret_name = secret_name
        self.region_name = region_name
        self.client = boto3.client(
            service_name='secretsmanager',
            region_name=region_name
        )

    def get_secret(self, key):
        """
        Retrieves the value of a secret based on the provided key.

        Args:
            key (str): The key of the secret to retrieve.

        Returns:
            str: The value of the secret.

        Raises:
            ClientError: If there is an error retrieving the secret.
        """
        try:
            get_secret_value_response = self.client.get_secret_value(
                SecretId=self.secret_name
            )
        except ClientError as e:
            raise e

        secret_data = get_secret_value_response['SecretString']
        secret = ast.literal_eval(secret_data)

        return secret[key]


secrets_manager = SecretsManager(
    secret_name=os.environ.get("SECRET_NAME"),
    region_name=os.environ.get("REGION_NAME")
)

app = App(
    signing_secret=secrets_manager.get_secret("SlackSigningSecret"),
    token=secrets_manager.get_secret("SlackBotToken"),
    process_before_response=True,
)


class SlackStreamingCallbackHandler(BaseCallbackHandler):
    """
    A callback handler for handling events during Slack streaming.

    Attributes:
        last_send_time (float): The timestamp of the last message sent.
        message (str): The accumulated message to be sent.

    Args:
        channel (str): The Slack channel to send messages to.
        ts (str): The timestamp of the message to be updated.
    """

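    def __init__(self, userid, channel, ts):
        self.userid = userid
        self.channel = channel
        self.ts = ts
        self.interval = CHAT_UPDATE_INTERVAL_SEC
        self.update_count = 0
        # Initialize per instance so that no state is carried over between
        # invocations in a warm Lambda execution environment.
        self.last_send_time = time.time()
        self.message = ""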

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        """
        Event handler for a new token received.

        Args:
            token (str): The new token received.
            **kwargs: Additional keyword arguments.
        """

        self.message += token

        now = time.time()
        if now - self.last_send_time > self.interval:

            # mention_message = f"<@{self.userid}> {self.message}"
            # message_blocks = create_message_blocks(mention_message)

            app.client.chat_update(
                channel=self.channel,
                ts=self.ts,
                text=f"<@{self.userid}> {self.message}",
                # blocks=message_blocks
            )
            self.last_send_time = now
            self.update_count += 1

            if self.update_count / 10 > self.interval:
                self.interval = self.interval * 2

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
        """
        Event handler for the end of Slack streaming.

        Args:
            response (LLMResult): The result of the Slack streaming.
            **kwargs: Additional keyword arguments.

        Returns:
            Any: The result of the event handling.
        """

        mention_message = f"<@{self.userid}> {self.message}"
        message_blocks = create_message_blocks(mention_message)

        app.client.chat_update(
            channel=self.channel,
            ts=self.ts,
            text=self.message,
            blocks=message_blocks
        )


def create_message_blocks(text):
    """
    Creates the message blocks for updating the Slack message.

    Args:
        text (str): The updated text for the Slack message.

    Returns:
        list: The message blocks for updating the Slack message.
    """

    message_context = "Claude 3 Sonnetで生成される情報は不正確な場合があります。"
    message_blocks = [
        {
            "type": "section",
            "text":
                {
                    "type": "mrkdwn",
                    "text": text
                }
        },
        {
            "type": "divider"
        },
        {
            "type": "context",
            "elements": [
                {
                    "type": "mrkdwn",
                    "text": message_context
                }
            ]
        },
    ]

    return message_blocks


def prompt_template():
    """
    Returns the prompt template for the chat.

    Returns:
        str: The prompt template.
    """

    chat_template = """
    Let's think step by step.
    Take a deep breath.

    Answer the question based on the context below.
    And also, follow the rules below.

    These are the rules for the chat:
    Answer in Japanese if the question is asked in Japanese.
    If you cannot answer a question due to lack of specificity, please advise on how to ask the question.


    This is your context:
    {context}

    Question: {question}
    Answer:
    """

    return PromptTemplate(
        input_variables=["context", "question"],
        template=chat_template
    )


def system_instruction_template():
    """
    Returns the system instruction template for the chat.

    Returns:
        str: The system instruction template.
    """

    # Define your system instruction
    system_instruction = "The assistant should provide detailed explanations."

    # Define your template with the system instruction
    template = (
        f"{system_instruction} "
        "Combine the chat history and follow up question into "
        "a standalone question. Chat History: {chat_history} "
        "Follow up question: {question}"
    )

    # Create the prompt template
    return PromptTemplate.from_template(template)


def get_bedrock_knowledge_base(knowledge_base_id, region_name):

    return AmazonKnowledgeBasesRetriever(
            knowledge_base_id=knowledge_base_id,
            region_name=region_name,
            retrieval_config={
                "vectorSearchConfiguration": {
                    "numberOfResults": 5
                }
            }
        )


def get_bedrock_llm(model_id, region_name, callback: SlackStreamingCallbackHandler):

    return BedrockChat(
            model_id=model_id,
            region_name=region_name,
            streaming=True,
            callbacks=[callback],
            model_kwargs={
                "max_tokens": 500,
                "temperature": 0.99,
                "top_p": 0.999
            },
            verbose=True
        )


def get_chain_bedrock_knowledge_base(llm, memory, knowledge_base_id, region_name):
    retriever = get_bedrock_knowledge_base(
                knowledge_base_id=knowledge_base_id,
                region_name=region_name
            )

    return ConversationalRetrievalChain.from_llm(
            llm=llm,
            retriever=retriever,
            chain_type="stuff",  # Or "refine" | "map_reduce"
            memory=memory,
            return_source_documents=True,
            # Prompt template for generated question .
            condense_question_prompt=system_instruction_template(),
            # Prompt template for combining documents.
            combine_docs_chain_kwargs={'prompt': prompt_template()},
            get_chat_history=lambda h: h,
            verbose=True,
            # Rephrase the question before asking the knowledge base.
            rephrase_question=False
        )


def handle_app_mentions(event, say):
    """
    Handle app mentions in Slack.

    Args:
        event (dict): The event data containing information about the mention.
        say (function): The function used to send a message in Slack.

    Returns:
        None
    """

    channel = event["channel"]
    thread_ts = event["ts"]
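    # Remove only the mention tokens such as <@U12345>; a greedy ".*" would
    # also remove any text between the first "<@" and the last ">".
    input_text = re.sub(r"<@[^>]+>", "", event["text"])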
    userid = event["user"]

    # Use the thread timestamp as the session ID:
    # event["ts"] for the first message, event["thread_ts"] for replies in the thread.
    id_ts = event["ts"]
    if "thread_ts" in event:
        id_ts = event["thread_ts"]

    result = say("\n\nお待ちください...", thread_ts=thread_ts)
    ts = result["ts"]

    history = DynamoDBChatMessageHistory(
            table_name=os.environ.get('DYNAMODB_TABLE_NAME'),
            session_id=id_ts,
            ttl=3600
        )

    callback = SlackStreamingCallbackHandler(
                    userid=userid,
                    channel=channel,
                    ts=ts
                )

    llm = get_bedrock_llm(
            model_id=MODEL_ID,
            region_name=REGION_NAME,
            callback=callback
            )

    memory = ConversationBufferMemory(
                chat_memory=history,
                input_key="question",
                memory_key="chat_history",
                output_key="answer",
                # Return messages in the memory as list.
                return_messages=True,
                human_prefix="H",
                ai_prefix="A"
            )

    chain = get_chain_bedrock_knowledge_base(
                llm=llm,
                memory=memory,
                knowledge_base_id=os.environ.get("KNOWLEDGE_BASE_ID"),
                region_name=REGION_NAME
            )

    result = chain.invoke(
                {
                    "question": input_text,
                    "chat_history": memory.chat_memory.messages
                }
            )

    source_documents = result.get('source_documents')
    uri, score, references = "", "", ""
    for i, refs in enumerate(source_documents):
        count = i + 1
        uri = refs.metadata['location']['s3Location']['uri']
        score = round(refs.metadata['score'] * 100, 2)
        text = re.sub(r"[\n\s]+", "", refs.page_content[:40])

        references += f'[{count}] <{uri}|{text}...>' + "  " + f"(関連度: {score}%)\n"

    say("[参照情報]\n\n" + references, thread_ts=thread_ts)


def respond_to_slack_within_3_seconds(ack):
    """
    Responds to a Slack message within 3 seconds.

    Parameters:
    - ack: A function to acknowledge the Slack message.

    Returns:
    None
    """
    ack()


app.event("app_mention")(
    ack=respond_to_slack_within_3_seconds,
    lazy=[handle_app_mentions]
)


def lambda_handler(event, context):
    """
    Lambda function handler for processing Slack events.

    Args:
        event (dict): The event data passed to the Lambda function.
        context (object): The runtime information of the Lambda function.

    Returns:
        dict: The response data to be returned by the Lambda function.
    """
    logger.debug("Received event: %s", event)

    retry_counts = event.get("multiValueHeaders", {}).get("X-Slack-Retry-Num", [0])

    if retry_counts[0] != 0:
        logger.info("Skip slack retrying(%s).", retry_counts)
        return {}

    slack_handler = SlackRequestHandler(app=app)
    return slack_handler.handle(event, context)
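The update throttling in SlackStreamingCallbackHandler.on_llm_new_token is easier to follow in isolation: a Slack message update is sent only when more than `interval` seconds have passed since the last one, and the interval doubles as the number of updates grows. The sketch below reproduces that rule with the clock passed in as an argument so it can be simulated; the class name UpdateThrottle and the simulated token timing are assumptions for illustration, not part of app.py.

```python
class UpdateThrottle:
    """Decide whether enough time has passed to send another update."""

    def __init__(self, interval_sec: float = 1.0, now: float = 0.0):
        self.interval = interval_sec
        self.last_send_time = now
        self.update_count = 0

    def should_send(self, now: float) -> bool:
        # Same condition as app.py: strictly more than `interval` seconds.
        if now - self.last_send_time <= self.interval:
            return False
        self.last_send_time = now
        self.update_count += 1
        # Back off: after more than 10 * interval updates, double the interval.
        if self.update_count / 10 > self.interval:
            self.interval *= 2
        return True


# Simulate one token arriving every 0.5 seconds for 60 seconds.
throttle = UpdateThrottle(interval_sec=1.0, now=0.0)
sent_at = []
for i in range(1, 121):
    t = i * 0.5
    if throttle.should_send(t):
        sent_at.append(t)

print(f"{len(sent_at)} updates sent, final interval {throttle.interval}s")
```

Backing off this way keeps long answers from generating a steady stream of chat.update calls while still showing early tokens quickly.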

Build

Run the build command in the directory that contains template.yaml.

$ sam build

If the build succeeds, a message like the following is displayed.

Starting Build use cache
Manifest is not changed for (BedrockAssistantFunction), running incremental build
Building codeuri:
/home/xxxx/aws-sam-bedrock-slack-backlog-help-rag-app/bedrock-slack-backlog-rag-app runtime:
python3.12 metadata: {} architecture: arm64 functions: BedrockAssistantFunction
 Running PythonPipBuilder:CopySource
 Running PythonPipBuilder:CopySource

Build Succeeded

Built Artifacts  : .aws-sam/build
Built Template   : .aws-sam/build/template.yaml

Commands you can use next
=========================
[*] Validate SAM template: sam validate
[*] Invoke Function: sam local invoke
[*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch
[*] Deploy: sam deploy --guided
>>> elapsed time 19s

Deploy

If the build finishes without errors, run the sam deploy command to deploy.

$ sam deploy
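Once the stack exists, the values declared under Outputs in template.yaml (for example BedrockAssistantApi and BedrockKnowledgeBaseId) can be read from the CloudFormation DescribeStacks response. The sketch below converts such a response into a plain dictionary. The helper stack_outputs is hypothetical and the sample response contains placeholder values, not output from a real deployment.

```python
def stack_outputs(describe_stacks_response: dict) -> dict:
    """Map OutputKey to OutputValue for the first stack in the response."""
    stacks = describe_stacks_response.get("Stacks", [])
    if not stacks:
        return {}
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stacks[0].get("Outputs", [])
    }


# Placeholder response in the shape returned by
#   boto3.client("cloudformation").describe_stacks(
#       StackName="bedrock-slack-backlog-rag-app")
sample_response = {
    "Stacks": [
        {
            "StackName": "bedrock-slack-backlog-rag-app",
            "Outputs": [
                {
                    "OutputKey": "BedrockAssistantApi",
                    "OutputValue": "https://example.execute-api.us-east-1.amazonaws.com/Prod/slack/events",
                },
                {"OutputKey": "BedrockKnowledgeBaseId", "OutputValue": "EXAMPLEKBID"},
            ],
        }
    ]
}

outputs = stack_outputs(sample_response)
print(outputs["BedrockAssistantApi"])
```

The BedrockAssistantApi value is the URL to register as the Request URL of the Slack Event Subscriptions.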

When the deployment succeeds, output like the following is printed to the console.
A variety of resources are created.

❯ sam deploy                                                                                                        [22:46:27]

Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-xxxxxxxxxxxx
A different default S3 bucket can be set in samconfig.toml
Or by specifying --s3-bucket explicitly.
Uploading to 8559dbd55989bf3a043fe8b1d74dbe38  27454180 / 27454180  (100.00%)

Deploying with following values
===============================
Stack name                   : bedrock-slack-backlog-rag-app
Region                       : us-east-1
Confirm changeset            : True
Disable rollback             : False
Deployment s3 bucket         : aws-sam-cli-managed-default-samclisourcebucket-xxxxxxxxxxxx
Capabilities                 : ["CAPABILITY_NAMED_IAM"]
Parameter overrides          : {}
Signing Profiles             : {}

Initiating deployment
=====================

Uploading to 906e473620d1e23df7adfb971d83aac2.template  9024 / 9024  (100.00%)


Waiting for changeset to be created..

CloudFormation stack changeset
-----------------------------------------------------------------------------------------------------------------------------
Operation                       LogicalResourceId               ResourceType                    Replacement
-----------------------------------------------------------------------------------------------------------------------------
+ Add                           BacklogAssitantLogGroup         AWS::Logs::LogGroup             N/A
+ Add                           BedrockAccessPolicy             AWS::IAM::ManagedPolicy         N/A
+ Add                           BedrockAssistantFunctionSlack   AWS::Lambda::Permission         N/A
                PermissionProd
+ Add                           BedrockAssistantFunction        AWS::Lambda::Function           N/A
+ Add                           BedrockKnowledgeBaseRole        AWS::IAM::Role                  N/A
+ Add                           DataSource                      AWS::Bedrock::DataSource        N/A
+ Add                           DynamoDBTable                   AWS::DynamoDB::Table            N/A
+ Add                           KnowledgeBaseWithPinecone       AWS::Bedrock::KnowledgeBase     N/A
+ Add                           LambdaRole                      AWS::IAM::Role                  N/A
+ Add                           S3AccessPolicy                  AWS::IAM::ManagedPolicy         N/A
+ Add                           ServerlessRestApiDeploymenta8   AWS::ApiGateway::Deployment     N/A
                73fedfca
+ Add                           ServerlessRestApiProdStage      AWS::ApiGateway::Stage          N/A
+ Add                           ServerlessRestApi               AWS::ApiGateway::RestApi        N/A
-----------------------------------------------------------------------------------------------------------------------------


Changeset created successfully. arn:aws:cloudformation:us-east-1:xxxxxxxxxxxx:changeSet/samcli-deploy1714571221/0379b200-4130-4af7-b18e-xxxxxxxxxxxx


Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y

2024-05-01 22:48:13 - Waiting for stack create/update to complete

CloudFormation events from stack operations (refresh every 5.0 seconds)
-----------------------------------------------------------------------------------------------------------------------------
ResourceStatus                  ResourceType                    LogicalResourceId               ResourceStatusReason
-----------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS              AWS::CloudFormation::Stack      bedrock-slack-backlog-rag-app   User Initiated
CREATE_IN_PROGRESS              AWS::DynamoDB::Table            DynamoDBTable                   -
CREATE_IN_PROGRESS              AWS::IAM::Role                  BedrockKnowledgeBaseRole        -
CREATE_IN_PROGRESS              AWS::IAM::Role                  LambdaRole                      -
CREATE_IN_PROGRESS              AWS::IAM::Role                  BedrockKnowledgeBaseRole        Resource creation Initiated
CREATE_IN_PROGRESS              AWS::IAM::Role                  LambdaRole                      Resource creation Initiated
CREATE_IN_PROGRESS              AWS::DynamoDB::Table            DynamoDBTable                   Resource creation Initiated
CREATE_COMPLETE                 AWS::DynamoDB::Table            DynamoDBTable                   -
CREATE_COMPLETE                 AWS::IAM::Role                  BedrockKnowledgeBaseRole        -
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         S3AccessPolicy                  -
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         BedrockAccessPolicy             -
CREATE_COMPLETE                 AWS::IAM::Role                  LambdaRole                      -
CREATE_IN_PROGRESS              AWS::Bedrock::KnowledgeBase     KnowledgeBaseWithPinecone       -
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         S3AccessPolicy                  Resource creation Initiated
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         BedrockAccessPolicy             Resource creation Initiated
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         S3AccessPolicy                  Eventual consistency check
                                                                                initiated
CREATE_IN_PROGRESS              AWS::Bedrock::KnowledgeBase     KnowledgeBaseWithPinecone       Resource creation Initiated
CREATE_IN_PROGRESS              AWS::IAM::ManagedPolicy         BedrockAccessPolicy             Eventual consistency check
                                                                                initiated
CREATE_COMPLETE                 AWS::Bedrock::KnowledgeBase     KnowledgeBaseWithPinecone       -
CREATE_IN_PROGRESS              AWS::Bedrock::DataSource        DataSource                      -
CREATE_IN_PROGRESS              AWS::Lambda::Function           BedrockAssistantFunction        -
CREATE_IN_PROGRESS              AWS::Bedrock::DataSource        DataSource                      Resource creation Initiated
CREATE_COMPLETE                 AWS::Bedrock::DataSource        DataSource                      -
CREATE_COMPLETE                 AWS::IAM::ManagedPolicy         S3AccessPolicy                  -
CREATE_COMPLETE                 AWS::IAM::ManagedPolicy         BedrockAccessPolicy             -
CREATE_IN_PROGRESS              AWS::Lambda::Function           BedrockAssistantFunction        Resource creation Initiated
CREATE_IN_PROGRESS              AWS::Lambda::Function           BedrockAssistantFunction        Eventual consistency check
                                                                                initiated
CREATE_IN_PROGRESS              AWS::Logs::LogGroup             BacklogAssitantLogGroup         -
CREATE_IN_PROGRESS              AWS::ApiGateway::RestApi        ServerlessRestApi               -
CREATE_IN_PROGRESS              AWS::Logs::LogGroup             BacklogAssitantLogGroup         Resource creation Initiated
CREATE_IN_PROGRESS              AWS::ApiGateway::RestApi        ServerlessRestApi               Resource creation Initiated
CREATE_COMPLETE                 AWS::ApiGateway::RestApi        ServerlessRestApi               -
CREATE_IN_PROGRESS              AWS::ApiGateway::Deployment     ServerlessRestApiDeploymenta8   -
                                                73fedfca
CREATE_IN_PROGRESS              AWS::Lambda::Permission         BedrockAssistantFunctionSlack   -
                                                PermissionProd
CREATE_IN_PROGRESS              AWS::Lambda::Permission         BedrockAssistantFunctionSlack   Resource creation Initiated
                                                PermissionProd
CREATE_COMPLETE                 AWS::Lambda::Permission         BedrockAssistantFunctionSlack   -
                                                PermissionProd
CREATE_IN_PROGRESS              AWS::ApiGateway::Deployment     ServerlessRestApiDeploymenta8   Resource creation Initiated
                                                73fedfca
CREATE_COMPLETE                 AWS::ApiGateway::Deployment     ServerlessRestApiDeploymenta8   -
                                                73fedfca
CREATE_COMPLETE                 AWS::Lambda::Function           BedrockAssistantFunction        -
CREATE_IN_PROGRESS              AWS::ApiGateway::Stage          ServerlessRestApiProdStage      -
CREATE_IN_PROGRESS              AWS::ApiGateway::Stage          ServerlessRestApiProdStage      Resource creation Initiated
CREATE_COMPLETE                 AWS::ApiGateway::Stage          ServerlessRestApiProdStage      -
CREATE_COMPLETE                 AWS::Logs::LogGroup             BacklogAssitantLogGroup         -
CREATE_COMPLETE                 AWS::CloudFormation::Stack      bedrock-slack-backlog-rag-app   -
-----------------------------------------------------------------------------------------------------------------------------

CloudFormation outputs from deployed stack
-----------------------------------------------------------------------------------------------------------------------------
Outputs
-----------------------------------------------------------------------------------------------------------------------------
Key                 BedrockDataSourceId
Description         -
Value               xxxxxxxxxxxx|zzzzzzzzzzzz

Key                 BedrockAssistantApi
Description         The URL of Slack Event Subscriptions
Value               https://xxxxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Prod/slack/events

Key                 BedrockKnowledgeBaseId
Description         -
Value               xxxxxxxxxxxx

Key                 BedrockAssistantFunctionIamRole
Description         Implicit IAM Role created for Bedrock Assistant function
Value               arn:aws:iam::xxxxxxxxxxxx:role/bedrock-slack-backlog-rag-app-lambda-role

Key                 BedrockAssistantFunction
Description         Bedrock Assistant Lambda Function ARN
Value               arn:aws:lambda:us-east-1:xxxxxxxxxxxx:function:bedrock-slack-backlog-rag-BedrockAssistantFunction-xxxxxxxxxxxx
-----------------------------------------------------------------------------------------------------------------------------


Successfully created/updated stack - bedrock-slack-backlog-rag-app in us-east-1

>>> elapsed time 3m2s

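Syncing the data source with the vector data store

The knowledge base and its data source have been created, but the data in the S3 bucket has not yet been registered in Pinecone. Open the Data source section of the knowledge base you created and click the Sync button. Run Sync again whenever data in the S3 bucket is added or updated.

Syncing with Pinecone

Once Status shows Ready, the sync is complete. If you open the Index page in Pinecone, you can confirm that the data has been registered, as shown here.

Index built in Pinecone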
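
The Sync step can also be started from code instead of the console. The sketch below is one possible approach, not something the article itself does: it uses the boto3 bedrock-agent client's start_ingestion_job and get_ingestion_job calls, and the helper names split_data_source_output and sync_data_source are hypothetical. It also assumes that the BedrockDataSourceId stack output has the form knowledgeBaseId|dataSourceId, as the masked value in the deploy log suggests.

```python
import time

# Statuses after which an ingestion job no longer changes.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def split_data_source_output(value: str) -> tuple[str, str]:
    """Split a 'knowledgeBaseId|dataSourceId' string into its two parts."""
    knowledge_base_id, data_source_id = value.split("|", 1)
    return knowledge_base_id, data_source_id


def sync_data_source(client, knowledge_base_id, data_source_id, poll_seconds=10.0):
    """Start an ingestion job and poll until it reaches a terminal status."""
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id, dataSourceId=data_source_id
    )["ingestionJob"]
    while job["status"] not in TERMINAL_STATUSES:
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent", region_name="us-east-1")
#   kb_id, ds_id = split_data_source_output("<BedrockDataSourceId output value>")
#   print(sync_data_source(client, kb_id, ds_id))
```

Passing the client in as an argument keeps the polling logic independent of AWS credentials, so it can be checked with a stub before running it against a real knowledge base.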

Building the Slack chatbot and more

From here, you can put RAG to use, for example by building a Slack chatbot as described in the earlier article.
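
As a quick way to confirm that the synced knowledge base answers questions, it can be queried directly. This is a hedged sketch and not necessarily how the article's app.py does it: it assumes the boto3 bedrock-agent-runtime client's retrieve_and_generate call and the Claude 3 Sonnet model listed in the prerequisites. build_rag_request and ask_knowledge_base are hypothetical names introduced for illustration.

```python
# Assumed model: Claude 3 Sonnet, which the prerequisites list as available in us-east-1.
MODEL_ARN = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "anthropic.claude-3-sonnet-20240229-v1:0"
)


def build_rag_request(question: str, knowledge_base_id: str, model_arn: str = MODEL_ARN) -> dict:
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(client, question: str, knowledge_base_id: str) -> str:
    """Send the question to the knowledge base and return the generated answer text."""
    response = client.retrieve_and_generate(**build_rag_request(question, knowledge_base_id))
    return response["output"]["text"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   print(ask_knowledge_base(client, "<question>", "<BedrockKnowledgeBaseId output value>"))
```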

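Summary

By using an AWS SAM template.yaml, I was able to codify the Knowledge Bases for Amazon Bedrock setup that I had previously built by hand. Amazon Bedrock ships updates in quick succession, so AWS CloudFormation support lags behind in some areas, but being able to stand up an environment for trying out features with a single command is very useful.

The Amazon Bedrock console layout also changes frequently, which made it tedious to keep retaking screenshots for the setup steps. Codifying the setup saved a good deal of that effort.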
