AI Chatbot on AWS

Last updated at 2024-09-12, posted at 2024-09-06

Launching and operating an AI chatbot

(I pieced together information scattered around the web and got something that works for now.)

These are the steps for running an AI chatbot on AWS.

First, launch an AWS server by following the article below.

Then access this AWS server and set up the environment for the AI chatbot.

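Work inside the Ubuntu window of WSL2.
Assume that the user name on the AWS EC2 instance is ubuntu and that the private key file of the instance's key pair is in the working directory under the name xxxx.pem.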

Logging in to AWS

ssh -i "xxxx.pem" ubuntu@<AWS domain name>

Virtual environment

Activate the environment with the following command.

source api/bin/activate

The environment name (api in this case) then appears at the left of the command prompt.

(api) ubuntu@ip-xxxxxxxxx

Setting the OpenAI API key

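Set the OpenAI API key at the end of .bashrc in the user's home directory, as shown below. The API key is obtained from the OpenAI site. Also, the OpenAI API is a paid service, so charge at least $10 to the account by credit card beforehand.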

export OPENAI_API_KEY='sk-proj-........................'
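
The server code later in this article imports API_KEY from settings/setting.py, a file the article does not show. A minimal sketch of what such a module could contain, assuming it simply reads the variable exported in .bashrc (the function name read_api_key is hypothetical):

```python
import os


def read_api_key(environ=os.environ):
    """Return the OpenAI API key from the environment, or raise if it is missing."""
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add the export line to ~/.bashrc")
    return key


# In a hypothetical settings/setting.py this could be used as:
# API_KEY = read_api_key()
```

Failing early with a clear message is easier to diagnose than an authentication error raised on the first request. Remember to open a new shell (or run source ~/.bashrc) so that the export takes effect.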

File transfer (local <-> remote)

The Python code that runs as the API server on AWS is either transferred from the local machine to AWS or written with an editor on AWS.

Receiving a file from AWS

scp -i xxxx.pem ubuntu@<AWS domain name>:<file name> .

"." denotes the current directory.

Sending a file to AWS

scp -i xxxx.pem test.pdf ubuntu@<AWS domain name>:~/

"~/" denotes the home directory.

Creating and configuring the AI server (Python code)

At the time of writing, 'gpt-4o-mini' is fast and very cheap for uses such as a chatbot.

# Standard library
import json

# Flask imports
from flask import Flask, request, jsonify, render_template

# LangChain imports
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain.chains import ConversationalRetrievalChain, LLMChain
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import PyPDFLoader

# Local settings module that provides the OpenAI API key
from settings.setting import API_KEY

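def load_pdf_document(file_path):
    # Load a PDF into a list of LangChain documents (one per page)
    loader = PyPDFLoader(file_path)
    return loader.load()


def split_documents_into_chunks(documents, chunk_size=1000, chunk_overlap=0):
    # Simple splitter kept for reference; the code below uses RecursiveCharacterTextSplitter
    text_splitter = CharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return text_splitter.split_documents(documents)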

app = Flask(__name__)

template_qg = """
次の会話に対しフォローアップの質問があるので、フォローアップの質問を独立した質問に言い換えなさい。
        
チャットの履歴:
{chat_history}
                       
フォローアップの質問:
{question}
                                          
言い換えられた独立した質問:"""

prompt_qg = PromptTemplate(
          template=template_qg,
          input_variables=["chat_history", "question"],
          output_parser=None,
          partial_variables={},
          template_format='f-string',
          validate_template=True,
)

prompt_template_qa = """あなたの名前は XXXX　YYYYのマスコットキャラクター.　ユーザーからの入力があった際には、必ず知識源ファイルを丁寧に調べてXXXXの言葉として返答します。返答は日本語で行います。話口調で、親し気な口調で話しかけます。「知識源」という言葉は使わないで、自分のことは「僕」と表現します
写真を見せるときはリンクを次のように提示します。 <img src ="link address">

{context}

Question: {question} Answer in Japanese:"""

prompt_qa = PromptTemplate(
       template=prompt_template_qa, 
       input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": prompt_qa}


# Configure the ChatOpenAI model
LLM = ChatOpenAI(
    # model_name='gpt-3.5-turbo',
    model_name='gpt-4o-mini',
    temperature=0,
    api_key=API_KEY,
)

# Load the knowledge-source PDF and split it into overlapping chunks for embedding

data = load_pdf_document("pdf_data/knowledge.pdf")
#docs = split_documents_into_chunks(data, chunk_size=1000, chunk_overlap=0)
text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=100,
        length_function=len,
        separators=["\n\n", "\n", "。", " ", "、", ""],
      )
docs = text_splitter.split_documents(data)
#docs = split_documents_into_chunks(data)
vectorstore = Chroma.from_documents(documents=docs, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
question_generator = LLMChain(llm=LLM, prompt=prompt_qg)
doc_chain = load_qa_chain(llm=LLM, chain_type="stuff", prompt=prompt_qa)
#
qa = ConversationalRetrievalChain(
                retriever=retriever,
                question_generator=question_generator,
                combine_docs_chain=doc_chain,
         )
chat_history = []

# Flask routes
@app.route('/')
def index():
    return render_template('index.html')

@app.route('/ask', methods=['POST'])
def ask(): 
    body = request.get_data(as_text=True)
    data = json.loads(body)
    app.logger.info("Request body: " + body)
    prompt = data["question"] 
    result = qa.invoke({"question": prompt, "chat_history": chat_history})['answer']
    print("Q ", prompt)
    print("A ", result)
    # Keep only the five most recent (question, answer) turns as context
    chat_history.append((prompt, result))
    if len(chat_history) > 5:
        chat_history.pop(0)
    return jsonify({"answer": result})
# Run application
if __name__ == '__main__':
#
    app.run()
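
The /ask handler keeps only the last five exchanges by appending a turn and then popping the oldest one. The same windowing can be sketched with collections.deque, which discards old items automatically. This is an alternative illustration rather than the code used above, and the names MAX_TURNS, history and remember are hypothetical.

```python
from collections import deque

MAX_TURNS = 5  # same limit as the handler above

# deque(maxlen=N) silently discards the oldest item once N is exceeded
history = deque(maxlen=MAX_TURNS)


def remember(question, answer):
    """Store one (question, answer) turn and return the history as a list of tuples."""
    history.append((question, answer))
    return list(history)


# Simulate seven turns; only the most recent five survive
for i in range(7):
    turns = remember(f"q{i}", f"a{i}")

print(len(turns))
print(turns[0])
```

Note that a single module-level history, as in the server code, is shared by every visitor. Keeping one history per user would need some session identifier, which this article does not cover.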

Setting up the web page that displays the conversation

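This is the web page that displays the conversation with the AI chatbot.
Save this file as index.html in the templates folder.
Place the character image chara.png that is shown on the page in the images folder under static (static/images/chara.png, matching the src attribute in the HTML).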

<!DOCTYPE html>
<html>
<head>
    <title>LangChain ChatBot</title>
    <script src="https://code.jquery.com/jquery-3.6.0.min.js"></script>
    <style>
        body, html {
            height: 100%;
            margin: 0;
            display: flex;
            flex-direction: column;
        }
        #chat {
            max-width: 100%;
            flex-grow: 1;
            overflow: auto;
        }
        .question,
        .answer {
            max-width: 80%;
            padding: 10px;
            margin: 5px;
            border-radius: 10px;
        }
        .question {
            background-color: #f1f1f1;
            float: right;
            clear: both;
        }
        .answer {
            background-color: #e6ffe6;
            float: left;
            clear: both;
        }
        .error {
            color: red;
            font-weight: bold;
        }
        #input-area {
            text-align: center;
            padding: 10px;
            background-color: #93e9e0;
        }

        /* Added from here */
        #input-area input[type="text"] {
            font-size: 1.2em; /* Make the text 1.2 times larger */
            padding: 15px 10px; /* More vertical padding for a taller input area */
            width: 50%; /* Set the width to half of the input area */
        }
        #input-area button {
            cursor: pointer; /* Show a pointing-hand cursor on hover */
            height: 100%;
            padding: 15px 10px; /* More vertical padding for a taller button */
            background-color: #93e9e0;
        }
        #page-title {
            display: flex;
            justify-content: center; /* Center horizontally */
            /*align-items: center;*/ /* Center vertically */
            height: 5vh; /* Set the height to 5% of the viewport height */
            text-align: center; /* Center the text */
            background-color: #93e9e0;
        }
        #title-icon {
            height: 2.8em; /* Match the image height to the h1 text height */
            width: 2.8em; /* Match the image width to the h1 text height as well */
            vertical-align: middle; /* Align the image with the middle of the text */
            margin-right: 10px; /* Add space between the image and the text */
        }
        h1 {
            margin: 0; /* Remove the default margin */
        }
        /* Added up to here */
    </style>
</head>
<body>
    <!-- Added from here -->
    <div id="page-title">
    <img  id="title-icon" src="/static/images/chara.png" alt="Garlic">
    <h1>Character name, etc.</h1> <!-- Title text -->
    </div>
    <!-- Added up to here -->
    <div id="chat">
        <!-- Chat history will go here -->
    </div>
    <div id="input-area">
        <input type="text" id="question" placeholder="Type your question here..." minlength="2" maxlength="200" size="100">
        <button id="submit">Submit</button>
    </div>
    <script>
        $(document).ready(function() {
            $("#submit").click(function() {
                const question = $("#question").val();
                $("#chat").append($("<p class='question'></p>").text("Q: " + question)); // .text() keeps user input from being parsed as HTML
                $('#chat').scrollTop($('#chat')[0].scrollHeight); // Scroll to the bottom of #chat

                // Clear the text input field
                $("#question").val('');
                
                $.ajax({
                    url: '/ask',
                    type: 'POST',
                    contentType: 'application/json',
                    data: JSON.stringify({ "question": question }),
                    success: function(data) {
                        const answer = data.answer;
                        $("#chat").append("<p class='answer'>A: " + answer + "</p>");
                        $('#chat').scrollTop($('#chat')[0].scrollHeight); // Scroll to the bottom of #chat

                        // Re-enable the submit button
                        $("#submit").prop("disabled", false);
                    },
                    error: function() {
                        $("#chat").append("<p class='answer error'>エラーが発生しました。申し訳ありません。もう一度質問してください。</p>");
                        
                    }
                });
            });
        });

        $(document).ready(function() {
            $("#question").keypress(function(event) {
                if (event.which == 13) { // 13 is the key code of the Enter key
                    event.preventDefault(); // Prevent automatic form submission
                    $("#submit").click(); // Trigger the click event of the Submit button
                }
            });
        });
    </script>
</body>
</html>
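
Flask looks up index.html under templates/ and serves files under static/, and the server code reads pdf_data/knowledge.pdf. A small sketch that checks this layout before the server is started. The helper name missing_files is hypothetical, and the paths are the ones that the HTML and Python code in this article reference.

```python
from pathlib import Path

# Paths referenced by the server code and the HTML in this article
EXPECTED = [
    "templates/index.html",
    "static/images/chara.png",
    "pdf_data/knowledge.pdf",
]


def missing_files(root="."):
    """Return the expected files that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]


if __name__ == "__main__":
    # Run from the directory that contains the server script
    for rel in missing_files():
        print("missing:", rel)
```

Running this from the project directory prints one line per missing file and nothing when the layout is complete.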

Preparing the knowledge source

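To build the knowledge source, copy and paste the character's conversations on social media, blog posts, and web page articles into a Word document. Convert this Word file to PDF and save it as knowledge.pdf in the pdf_data folder.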
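
The server code splits the PDF text into chunks of up to 1000 characters with a 100-character overlap before embedding it. The following is a simplified sliding-window sketch of what the size and overlap parameters mean. It is not how RecursiveCharacterTextSplitter works internally (that splitter also prefers to break at separators such as "。"), and the function name sliding_chunks is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=100):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# 2500 characters of dummy text stand in for the extracted PDF text
sample = "".join(str(i % 10) for i in range(2500))
parts = sliding_chunks(sample)
print([len(p) for p in parts])
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.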

Startup

Start the server with the following command, assuming the Python code is saved as CharaAI.py.

uwsgi --socket=/tmp/uwsgi.sock --wsgi-file=CharaAI.py  --callable=app --chmod-socket=666
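
Once the server is up, the /ask endpoint can be exercised from Python with only the standard library. With --socket, uWSGI speaks its own protocol and normally sits behind a web server such as nginx, so the base URL below is a placeholder assumption. build_ask_request and ask are hypothetical helper names.

```python
import json
import urllib.request


def build_ask_request(base_url, question):
    """Build the POST request that the /ask route in this article expects."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, question, timeout=60):
    """Send one question and return the text stored under the "answer" key."""
    request = build_ask_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["answer"]


# Example (placeholder host, requires a running server):
# print(ask("http://<AWS domain name>", "こんにちは"))
```

A check like this confirms that the chain, the vector store, and the API key all work before the browser page is involved.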
