
Globee Advent Calendar 2023

@LyW

An Introduction to LlamaIndex

Posted at 2023-12-19

LlamaIndex: Experiment Log & Manual

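A lot of LlamaIndex code is either missing from the official documentation or, because of feature updates, documented only on GitHub. ChatGPT does not know how to use the library either, so the experiments were a struggle.

This manual therefore covers LlamaIndex from basic usage through to more advanced use.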

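Basic introduction

LlamaIndex is designed to act as memory for an AI. It has no language capability of its own, so it needs an LLM API such as ChatGPT or LLAMA2.
The basic design is this: when all of the knowledge cannot fit into the prompt, a vector dataset is used to quickly extract the data relevant to the prompt, and that data is inserted into the prompt.
For example, it can extract how a specific word is used from dictionary data.
However, it is neither a cure-all nor simple. Unless data preparation, loading, metadata, and so on are set up well, it does not deliver what the design promises.
We start with a simple example.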
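
As a warm-up before the real library, the retrieve-then-insert idea described above can be sketched in plain Python. This is only an illustration under loud assumptions: `embed` is a toy word-count vector standing in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helpers, not LlamaIndex functions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vec, embed(chunk)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, chunks, top_k=1):
    """Insert only the retrieved chunks into the prompt, not the whole dataset."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Context information is below.\n{context}\nQuery: {query}\nAnswer: "


chunks = [
    "apple: a round fruit that grows on trees",
    "run: to move quickly on foot",
    "index: an ordered list that helps you find things",
]
prompt = build_prompt("how do I use the word run", chunks)
print(prompt)
```

Only the dictionary entry that shares words with the question ends up in the prompt; the other entries never reach the LLM.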

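A simple example

Prerequisite: 映画.json is stored in dataset/.

Sample data from 映画.json (the keys mean title, Japanese title, release year, story, popularity ranking, genre, difficulty, length, and features):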

"movie_list": [
        {
            "タイトル": "IRON MAN",
            "邦題": "アイアンマン",
            "公開日": 2008,
            "ストーリー": "アフガニスタンで自社兵器のデモ実験に参加したトニー・スタークは、テロ組織に襲われ拉致されてしまう。胸に深い傷を負い捕虜となった彼は、組織のために最強兵器の開発を強制される。トニーは装着することで、圧倒的な破壊力とパワーを発揮できる戦闘用パワードスーツを敵の目を盗み開発。敵地からの脱出に成功するが、奇跡的に生還したトニーは、ある事実を知り愕然とする・・・。自らが社長を務める会社が開発した兵器がテロ組織に使用されていたのだ。トニーはその償いをすべく、テロ撲滅に命を捧げることを決断。最先端の技術を駆使し、新たなパワードスーツの開発に着手する。",
            "人気ランキング": "5位",
            "ジャンル": [
                "アクション"
            ],
            "難易度": "普通",
            "長さ": "7560秒",
            "特徴": [
                "ビジネス英会話が学べる",
                "フレーズ学習機能が利用できる作品",
                "TOEIC730点以上を目指す方におすすめ"
            ]
        },
...

First, the official docs use the following code to search the contents of a dataset.

from llama_index import VectorStoreIndex, SimpleDirectoryReader
# Load every file in the given folder
documents = SimpleDirectoryReader("dataset").load_data()
# Vectorize the documents
index = VectorStoreIndex.from_documents(documents)
# Initialize the query engine
query_engine = index.as_query_engine()
# The query means "Are there any movies you recommend?"
response = query_engine.query("おすすめの映画がありますか？")
print(response)
# Example output (translated): I recommend アウトロー, チャーリーズ・エンジェル フルスロットル, ヒックとドラゴン, チャーリーズ・エンジェル, and ゴースト／ニューヨークの幻.

So what exactly is ChatGPT doing here?
To find out, we need to look at the internal prompts.
First, prepare a function that pretty-prints the prompts.

from IPython.display import Markdown, display
# define prompt viewing function
def display_prompt_dict(prompts_dict):
    for k, p in prompts_dict.items():
        text_md = f"**Prompt Key**: {k}<br>" f"**Text:** <br>"
        display(Markdown(text_md))
        print(p.get_template())
        display(Markdown("<br><br>"))

Then run the following code:

display_prompt_dict(query_engine.get_prompts())

"""
Prompt Key: response_synthesizer:text_qa_template
Text:

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 

Prompt Key: response_synthesizer:refine_template
Text:

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer:
"""

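Two prompts are built in: response_synthesizer:text_qa_template and response_synthesizer:refine_template.
The former is used for Q&A. The latter is used when refining an answer (it was not used this time).
The prompts can of course be changed. The official docs do not state this explicitly, but the following steps work.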

from llama_index.prompts import PromptTemplate

new_query_tmpl_str = (
    "As an efficient assistant for Globee, a company specializing in English education, your role is to leverage the context as a primary information source to address user inquiries.\n" 
    "Your responses should be in Japanese, reflecting the company's focus. Ensure that your answers are both concise and user-friendly. \n"
    "Your goal is to offer supportive and informative assistance, tailored to the user's needs based on the contents of the uploaded document.\n"
    "Please follow the rules, it is very important to my career.\n"
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
new_qa_tmpl = PromptTemplate(new_query_tmpl_str)
query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": new_qa_tmpl}
)

An in-development version of the demo temporarily used the QA engine with this custom prompt.
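
To make the `{context_str}` and `{query_str}` placeholders concrete, here is a plain-Python sketch of how such a template gets filled. The template text is copied from the default shown above; `fill_qa_prompt` and the blank-line separator between chunks are assumptions for illustration, since the article does not show how LlamaIndex joins the retrieved chunks internally.

```python
TEXT_QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def fill_qa_prompt(template, retrieved_chunks, query):
    """Join the retrieved chunks and substitute both placeholders (hypothetical helper)."""
    context_str = "\n\n".join(retrieved_chunks)
    return template.format(context_str=context_str, query_str=query)


filled = fill_qa_prompt(
    TEXT_QA_TEMPLATE,
    ["chunk A", "chunk B"],
    "Which movie is recommended?",
)
print(filled)
```

Any custom template has to keep both placeholders, otherwise the retrieved context or the question has nowhere to go.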

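Challenges

The simple example appears to work at first glance, but it has many problems.

After listing the challenges, I introduce concrete solutions.

Fragmentation

When building the dataset, LlamaIndex splits the data into chunks before storing it. In doing so it stores records incompletely, leaving the data in fragments.
For example, building the dataset above produces chunks like these: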

DEBUG:llama_index.node_parser.node_utils:> Adding chunk: "特徴": [
                "発音が聞き取りやすく、音読学習におすすめ",...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: 彼らとの仕事にスリルを覚え、才能を活かしてきたベイビーだったが、恋人デボラ（リリー・ジェームズ...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: 「一つになれば、俺たちはなんだってできる」とシンビオートはエディの体を蝕み、このまま自分の乗り...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: しかし、そんな彼の前に地球上では下着のモデルとして活躍している超セクシーエイリアンのサーリーナ...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: そんなある日、何者かにより時空が歪められる大事故が起こる。その天地を揺るがす激しい衝撃により歪...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: "長さ": "7764秒",
            "特徴": [
            ...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: 果たして彼らは、生きて現実世界に帰ることができるのか！",
            "人気ラン...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: IT界の伝説ナップスター創設者のショーン・パーカーとの出会い、そして、社会現象を巻き起こすほど...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: 携帯電話会社のCEOでNY市長候補のスタックスは選挙キャンペーン中、車にはねられそうになった少...
DEBUG:llama_index.node_parser.node_utils:> Adding chunk: "女性が輝く作品",

Ideally, the data would be chunked along the JSON structure. Note that the split is not by length alone: by default it is sentence-based.
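
The difference between the two chunking strategies can be seen in a small standalone sketch. It is a simplification: `split_fixed` cuts purely by length (the default splitter described above is sentence-based, but it still ignores record boundaries), and the sample records and helper names are made up for illustration.

```python
import json

# Made-up records standing in for entries of 映画.json
records = [
    {"title": "MOVIE A", "story": "A long story. " * 3, "features": ["good for reading aloud"]},
    {"title": "MOVIE B", "story": "Another story. " * 3, "features": ["business English"]},
]


def split_fixed(text, size):
    """Cut the raw text every `size` characters, ignoring the JSON structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_per_record(items):
    """Keep each record whole, so every chunk is a complete JSON object."""
    return [json.dumps(item, ensure_ascii=False) for item in items]


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


raw = json.dumps(records, ensure_ascii=False)
fixed_chunks = split_fixed(raw, 60)
record_chunks = split_per_record(records)

print([is_valid_json(chunk) for chunk in fixed_chunks])
print([is_valid_json(chunk) for chunk in record_chunks])
```

Chunks cut by length start and end in the middle of a record, which is the fragmentation visible in the debug log above.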

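No persona (system prompt)

The example code interacts with ChatGPT through query_engine.query("おすすめの映画がありますか？").
Done this way, however, the agent's persona cannot be set.
In early experiments I forced the persona settings into query_str, and at first they took effect. But by the nature of QA, the whole query_str is used to search the dataset, so the forced persona text became noise.
In short, another route is needed.

Not a chat

No history is kept, and there is nowhere to put one. Forcing it in again gets in the way of the dataset search.

Metadata is not reflected

Even if fragmentation and the other problems were solved, search would still struggle.
For example, suppose this movie's data is properly stored as a single node: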

{
            "タイトル": "CHEF",
            "邦題": "シェフ 三ツ星フードトラック始めました",
            "公開日": 2014,
            "ストーリー": "「アイアンマン」シリーズ監督のジョン・ファヴローが製作・監督・脚本・主演4役を務め、フードトラックの移動販売をはじめた一流レストランの元料理長のアメリカ横断の旅を描いたハートフルコメディ。ロサンゼルスの有名レストランで料理長を務めるカールは、口うるさいオーナーや自分の料理を酷評する評論家とケンカして店を辞めてしまう。心配する元妻イネズの提案で、息子パーシーを連れて故郷のマイアミを訪れたカールは、そこで食べたキューバサンドイッチの美味しさに驚き、フードトラックでサンドイッチの移動販売をすることを思いつく。カールはイネズやパーシー、仲間たちの協力を得て、マイアミからニューオリンズ、ロサンゼルスへと旅を続けながら、本当に大切なものを見つけていく。",
            "人気ランキング": "18位",
            "ジャンル": [
                "コメディ",
                "ドラマ"
            ],
            "難易度": "難しい",
            "長さ": "6900秒",
            "特徴": [
                "海外の文化が学べる",
                "音楽を楽しめる作品",
                "英語の方言が学べる"
            ]
        }

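When a user searches for a film where they can enjoy the music, this record should ideally be hit, because its "特徴" (features) list contains "音楽を楽しめる作品" (a film where you can enjoy the music). It is not.
The reason is that the whole JSON is vectorized, so only questions similar to the content of the whole JSON can retrieve it.
In other words, we need a way to vectorize fields such as "特徴" and "長さ" separately.
But if we do that, even when "音楽を楽しめる作品" under "特徴" is hit, the movie's title and story are unknown, which creates a new problem.
These problems are solved by designing the metadata.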
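
One way to picture the metadata design is the plain-Python sketch below: the story becomes the searchable text, every other field travels with each sentence as metadata. The helper names are hypothetical, the record is shortened, and the sketch assumes the metadata is rendered together with the text when a node is embedded; how LlamaIndex does this internally is not shown in this article.

```python
def record_to_document(record, text_key):
    """Use one field as the main text and keep every other field as metadata."""
    metadata = {key: value for key, value in record.items() if key != text_key}
    return {"text": record[text_key], "metadata": metadata}


def split_sentences(text, sep="。"):
    """Naive Japanese sentence split on the full stop (illustration only)."""
    return [part + sep for part in text.split(sep) if part]


def to_nodes(document):
    """Every sentence becomes a node that carries the full metadata of its record."""
    return [
        {"text": sentence, "metadata": document["metadata"]}
        for sentence in split_sentences(document["text"])
    ]


def render_for_embedding(node):
    """Assumed rendering: metadata lines first, then the sentence itself."""
    meta_lines = [f"{key}: {value}" for key, value in node["metadata"].items()]
    return "\n".join(meta_lines + ["", node["text"]])


film = {
    "タイトル": "CHEF",
    "ストーリー": "カールは店を辞めてしまう。フードトラックで旅を続ける。",
    "特徴": ["音楽を楽しめる作品"],
}
nodes = to_nodes(record_to_document(film, "ストーリー"))
for node in nodes:
    print(render_for_embedding(node))
    print("---")
```

Each sentence-level node now contains both the feature "音楽を楽しめる作品" and the title, so a hit on any sentence still identifies the movie.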

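Mixed datasets

With movie data alone there are still few problems, but once textbook data is added, a much bigger one appears.
Movie stories and textbook descriptions can both contain content close to the question, and ChatGPT mixes the two in its answer.
For example, asking for movies suited to TOEIC 550 also brings up textbooks. You might think that since the question was about movies, textbooks should be impossible. Not so. If the only difference is "textbook" versus "movie", the text can still be close enough to the question, and ChatGPT's ability to correct for this does not help much here.
So when the question is about textbooks, the ideal is to search a dataset that contains only textbooks.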

Slow loading

With the simple example's approach, loading takes several minutes, so every web worker of the demo takes several minutes to start, which is very slow.
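
The full code at the end of this article deals with this by persisting the index to disk and loading it from storage. The build-once, load-afterwards idea can be sketched in plain Python as follows. `build_index` is a stand-in for the slow embedding step, and the JSON cache file is an assumption for illustration, not the format LlamaIndex uses.

```python
import json
import os
import tempfile


def build_index(chunks):
    """Stand-in for the slow step (embedding every chunk)."""
    return {"vectors": {chunk: len(chunk) for chunk in chunks}}


def load_or_build(path, chunks):
    """Load the cached index if it exists; otherwise build it once and save it."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f), "loaded"
    index = build_index(chunks)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f, ensure_ascii=False)
    return index, "built"


cache_path = os.path.join(tempfile.mkdtemp(), "index_buffer.json")
first, first_source = load_or_build(cache_path, ["chunk one", "chunk two"])
second, second_source = load_or_build(cache_path, ["chunk one", "chunk two"])
print(first_source, second_source)
```

Only the first start pays the build cost; later workers read the saved result.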

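No detailed ChatGPT settings

With the simple example's approach, ChatGPT cannot be configured.

A more complex example: the solutions

I was able to solve all of the challenges above, so here is that example.

Solving fragmentation and metadata: custom loading

The way the JSON is loaded can be set up by hand. Metadata is attached to every sentence chunk, so each chunk keeps the complete knowledge about its record.
The code is below.
- This code loads all of the data together.
- The chunk size is 512, and the Sentence Splitter is retained.
- For movies the main text is "ストーリー" (story), for textbooks it is "詳細" (details), and the ABCEED dataset is split as full text.
- Finally, IngestionPipeline turns the documents into nodes.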

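import json
import os

from llama_index import Document
from llama_index.ingestion import IngestionPipeline
from llama_index.text_splitter import SentenceSplitter

dir_path = os.path.dirname(os.path.realpath(__file__))

# Split each document into sentence-based chunks of up to 512 tokens, with no overlap
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512, chunk_overlap=0),
    ]
)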

with open(os.path.join(dir_path, "database", "映画.json")) as f:
    film_json = json.load(f)
# add a new field to each film in the json. type = "film"
for i in range(len(film_json["movie_list"])):
    film_json["movie_list"][i]["type"] = "film"

document_list = []
for film in film_json["movie_list"]:
    document = Document(
        text=film["ストーリー"],
        metadata={key: film[key] for key in film if key != "ストーリー"},
    )
    document_list.append(document)

with open(os.path.join(dir_path, "database", "教材.json")) as f:
    book_json = json.load(f)
# add a new field to each book in the json. type = "book"
for i in range(len(book_json["book_list"])):
    book_json["book_list"][i]["type"] = "book"

for book in book_json["book_list"]:
    if book["詳細"] is not None:
        document = Document(
            text=book["詳細"],
            metadata={key: book[key] for key in book if key != "詳細"},
        )
        document_list.append(document)

with open(os.path.join(dir_path, "database", "abceed君_textv2.txt")) as f:
    data_str = f.read()
    document = Document(
        text=data_str,
        metadata={"type": "abceed"},
    )
    document_list.append(document)
nodes = pipeline.run(documents=document_list)

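Solving chat, ChatGPT settings, and persona: CondensePlusContextChatEngine

This feature was added only last month, but it is remarkably capable.
Simply put, it is a chat engine that rewrites the user's question based on the conversation history, uses the rewritten question to search the dataset, and stores the history automatically.
The code is here.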

chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(
        similarity_top_k=4, filters=filters
    ),
    service_context=service_context,
    system_prompt="As an efficient assistant for Globee, a company specializing in English education, your role is to leverage the context as a primary information source to address user inquiries.\n"
    "Your responses should be in succinct Japanese, reflecting the company's focus. Ensure that your answers are both concise and user-friendly. \n"
    "Your goal is to offer supportive and informative assistance, tailored to the user's needs based on the contents of the uploaded document.\n"
    "When multiple contexts are given, it is not true that all contexts are relevant to the user's inquiry. You should only use the relevant contexts to answer the user's question.\n"
    "Focus on the Query ONLY.\n"
    "Please follow the rules and make the answer short (max 3-4 sentences). It is very important to my career.\n",
    chat_history=chat_history,
    node_postprocessors=[postprocessor],
    verbose=True,
)

similarity_top_k sets how many search results the dataset returns.
filters excludes datasets that are not needed for the question from the search.
service_context is how ChatGPT is configured. The code for it is as follows.

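from llama_index import ServiceContext
from llama_index.llms import OpenAI

# temperature=0 keeps answers as stable as possible; max_tokens caps the reply length
llm = OpenAI(temperature=0, model="gpt-4-1106-preview", max_tokens=1024)
service_context = ServiceContext.from_defaults(llm=llm)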

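chat_history is a list of ChatMessage objects. Normally you leave it empty and it is managed automatically, but the demo runs on gunicorn, so the history has to be fetched dynamically from Redis, which needs separate setup and customization. The code is here.
- A user_id is issued for each conversation.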

import json
import os

import redis
from llama_index.llms import ChatMessage

REDIS_HOST = os.getenv("REDIS_URL", "localhost")
# Environment variables are strings, so convert the port to int
REDIS_PORT = int(os.getenv("REDIS_PORT", 6379))
redis_client = redis.StrictRedis(host=REDIS_HOST, port=REDIS_PORT, db=0)

# Revised Redis functions:
def get_conversation_history(user_id):
    # Fetch all elements from the Redis list
    conversation = [json.loads(item) for item in redis_client.lrange(user_id, 0, -1)]
    messages = [
        ChatMessage(role=message["role"], content=message["content"])
        for message in conversation
    ]
    return messages

def update_conversation_history(user_id, message):
    # Push the new message to the end of the Redis list
    redis_client.rpush(user_id, message.json())
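
The round trip through the Redis list can be tried without a Redis server using the sketch below. `FakeRedisList` is a hypothetical in-memory stand-in that mimics only the two list commands used above (`rpush` and `lrange`), and messages are plain dicts instead of ChatMessage objects.

```python
import json


class FakeRedisList:
    """In-memory stand-in for the two Redis list commands used above."""

    def __init__(self):
        self._lists = {}

    def rpush(self, key, value):
        # Append to the end of the list stored under key
        self._lists.setdefault(key, []).append(value)

    def lrange(self, key, start, end):
        # Redis treats end as inclusive, and -1 means "up to the last element"
        items = self._lists.get(key, [])
        return items[start:] if end == -1 else items[start:end + 1]


store = FakeRedisList()


def append_message(user_id, role, content):
    """Serialize one message as JSON and push it to the end of the user's list."""
    payload = json.dumps({"role": role, "content": content}, ensure_ascii=False)
    store.rpush(user_id, payload)


def load_history(user_id):
    """Read the whole list back, oldest message first."""
    return [json.loads(item) for item in store.lrange(user_id, 0, -1)]


append_message("user-1", "user", "おすすめの映画は？")
append_message("user-1", "assistant", "アイアンマンがおすすめです。")
history = load_history("user-1")
print(history)
```

Because the history lives outside the process, any worker that receives the next request can rebuild the same conversation from the user_id.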

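node_postprocessors post-process the nodes after they are retrieved from the dataset. Here a filter removes results with low confidence, meaning a low similarity score. The setup code is here.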

from llama_index.postprocessor import SimilarityPostprocessor
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)
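
What the combination of similarity_top_k=4 and similarity_cutoff=0.7 amounts to can be sketched in plain Python. The scores below are made up, and `top_k_then_cutoff` is a hypothetical helper; the sketch assumes the retriever picks the top k first and the postprocessor then drops low scores.

```python
def top_k_then_cutoff(scored_nodes, top_k=4, cutoff=0.7):
    """Keep the top_k highest-scoring nodes, then drop those below the cutoff."""
    ranked = sorted(scored_nodes, key=lambda pair: pair[1], reverse=True)[:top_k]
    return [(text, score) for text, score in ranked if score >= cutoff]


# (node text, similarity score) pairs with made-up scores
scored_nodes = [
    ("node A", 0.91),
    ("node B", 0.40),
    ("node C", 0.72),
    ("node D", 0.65),
    ("node E", 0.85),
]
kept = top_k_then_cutoff(scored_nodes)
print(kept)
```

Note that the cutoff can leave fewer than top_k nodes, or none at all when nothing in the dataset is close to the question.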

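Solving mixed datasets: question preprocessing

The idea is simple.
ChatGPT classifies the question: movie questions go to movies, textbook questions to textbooks, ABCEED questions to ABCEED, and anything else to small talk.
Only for small talk do we skip the dataset and use a plain chatbot.
The code is here.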

system_categorizer_prompt = """
As a categorizer, your primary function is to classify inputs into one of four predetermined categories based solely on the content of the input. Your role is akin to an API, which means you should not engage in or initiate any conversation beyond categorization. Upon receiving a phrase or sentence, your task is to determine which of the following categories it best fits into:
The query could be in Japanese.

1. Film/Movie
2. Book/Textbook
3. App Feature/Functionality
4. Normal Talk (not related to English learning)

If an input pertains to more than one category, please list each applicable category number, separated by commas.

For instance:
===
Input: 'Can you suggest some engaging books to help enhance my English listening skills?'
Your Response: 2
===
Input: 'How many films are in our collection?'
Your Response: 1
===
Input: 'Are there any recommended movies or additional app features that could assist me?'
Your Response: 1, 3
===
Input: 'How to ride a bike?'
Your Response: 4

Remember, your responses should only include the relevant category number(s).
"""

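def cateorize(query_content):
    # Ask ChatGPT for the category number(s) of the query
    message_list = [
        {"role": "user", "content": system_categorizer_prompt},
        {"role": "user", "content": query_content},
    ]
    response = send_to_chatgpt(message_list, i=0, temp=0, stream=False, max_tokens=256)
    # Replies such as "1, 3" have a space after the comma, so strip each part;
    # without this, " 3" would be treated as unknown and turned into "4"
    response_list = [item.strip() for item in str(response).split(",")]
    type_dict = {1: "film", 2: "book", 3: "abceed", 4: "chat"}
    # Anything that is not a known dataset category falls back to small talk
    for i in range(len(response_list)):
        if response_list[i] not in {"1", "2", "3"}:
            response_list[i] = "4"
    return [type_dict[int(response)] for response in response_list]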
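
The parsing step is easy to get wrong, because a multi-category reply looks like "1, 3" with a space after the comma. The standalone sketch below isolates that step so it can be checked without calling ChatGPT. `parse_categories` is a hypothetical helper; the removal of duplicates is an addition of this sketch and is not in the code above.

```python
TYPE_BY_NUMBER = {"1": "film", "2": "book", "3": "abceed", "4": "chat"}


def parse_categories(reply):
    """Turn a reply such as "1, 3" into dataset labels; unknown parts become "chat"."""
    labels = []
    for part in str(reply).split(","):
        number = part.strip()  # " 3" -> "3"
        label = TYPE_BY_NUMBER.get(number, "chat")
        if label not in labels:  # drop duplicates (addition in this sketch)
            labels.append(label)
    return labels


print(parse_categories("1, 3"))
print(parse_categories("2"))
print(parse_categories("I am not sure"))
```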

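Then, when handling the question, we switch to a simple chat engine only when the type is small talk alone.
There is a trap here too: I could not find a plain chat engine in LlamaIndex at the time. So I got it working in an unusual way, by creating a tools-based agent with an empty tool list.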

data = request.json
query_content = data["message"]
type = cateorize(query_content)
chat_history = get_conversation_history(data["user_id"])
if ["chat"] == type:
    chat_engine = OpenAIAgent.from_tools(
        tools=[],
        llm=OpenAI(temperature=0, model="gpt-4"),
        chat_history=chat_history,
    )
else:
...

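For textbook, movie, or ABCEED questions, the corresponding filter is set.
- Filters are combined with AND by default. I wanted OR, which took a lot of effort: it is not available in version 0.9.6, so an update to the latest version, 0.9.13, is required.
- There is no documentation for it either, so finding the code was half luck.
The code is here.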

from llama_index.vector_stores.types import (
    ExactMatchFilter,
    FilterCondition,
    MetadataFilters,
)

def return_filter(keywords):
    return MetadataFilters(
        filters=[ExactMatchFilter(key="type", value=keyword) for keyword in keywords], condition=FilterCondition.OR
    )
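
Why OR is needed here can be seen in a plain-Python sketch. Every filter tests the same "type" key, so combining them with AND can never match once two different types are requested. The nodes and the `filter_nodes` helper are made up for illustration and are not the LlamaIndex implementation.

```python
def exact_match(node, key, value):
    """True when the node's metadata has exactly this value for the key."""
    return node["metadata"].get(key) == value


def filter_nodes(nodes, keywords, condition="or"):
    """Combine one exact-match test per keyword with OR (any) or AND (all)."""
    combine = any if condition == "or" else all
    return [
        node for node in nodes
        if combine(exact_match(node, "type", keyword) for keyword in keywords)
    ]


nodes = [
    {"text": "movie story", "metadata": {"type": "film"}},
    {"text": "textbook details", "metadata": {"type": "book"}},
    {"text": "app help", "metadata": {"type": "abceed"}},
]
print(filter_nodes(nodes, ["film", "book"], condition="or"))
print(filter_nodes(nodes, ["film", "book"], condition="and"))
```

With OR the movie and textbook nodes pass; with AND nothing passes, because no node has two types at once.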

That concludes this implementation log and LlamaIndex manual.

The full code is here:

import json
import logging
import os
from time import sleep

import openai
import redis
# Assumption: the route at the bottom looks like Flask; these imports are not in the original listing
from flask import Blueprint, jsonify, request
from llama_index import Document, ServiceContext, VectorStoreIndex
from llama_index.agent import OpenAIAgent
from llama_index.chat_engine.condense_plus_context import CondensePlusContextChatEngine
from llama_index.ingestion import IngestionPipeline
from llama_index.llms import ChatMessage, OpenAI
from llama_index.postprocessor import SimilarityPostprocessor
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.text_splitter import SentenceSplitter
from llama_index.vector_stores.types import ExactMatchFilter, FilterCondition, MetadataFilters

logging.basicConfig(level=logging.DEBUG)

# Assumption: the original listing omits these two objects; they are
# reconstructed from how they are used further down
assistant_api = Blueprint("assistant_api", __name__)
client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment

# Redis connection, as in the chat_history section above
REDIS_HOST = os.getenv("REDIS_URL", "localhost")
REDIS_PORT = int(os.getenv("REDIS_PORT", 6379))
redis_client = redis.StrictRedis(host=REDIS_HOST, port=REDIS_PORT, db=0)

dir_path = os.path.dirname(os.path.realpath(__file__))
# create the pipeline with transformations
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512, chunk_overlap=0),
    ]
)

with open(os.path.join(dir_path, "database", "映画.json")) as f:
    film_json = json.load(f)
# add a new field to each film in the json. type = "film"
for i in range(len(film_json["movie_list"])):
    film_json["movie_list"][i]["type"] = "film"

document_list = []
for film in film_json["movie_list"]:
    document = Document(
        text=film["ストーリー"],
        metadata={key: film[key] for key in film if key != "ストーリー"},
    )
    document_list.append(document)

with open(os.path.join(dir_path, "database", "教材.json")) as f:
    book_json = json.load(f)
# add a new field to each book in the json. type = "book"
for i in range(len(book_json["book_list"])):
    book_json["book_list"][i]["type"] = "book"

for book in book_json["book_list"]:
    if book["詳細"] is not None:
        document = Document(
            text=book["詳細"],
            metadata={key: book[key] for key in book if key != "詳細"},
        )
        document_list.append(document)

with open(os.path.join(dir_path, "database", "abceed君_textv2.txt")) as f:
    data_str = f.read()
    document = Document(
        text=data_str,
        metadata={"type": "abceed"},
    )
    document_list.append(document)
nodes = pipeline.run(documents=document_list)

def return_filter(keywords):
    return MetadataFilters(
        filters=[ExactMatchFilter(key="type", value=keyword) for keyword in keywords], condition=FilterCondition.OR
    )

llm = OpenAI(temperature=0, model="gpt-4-1106-preview", max_tokens=1024)
service_context = ServiceContext.from_defaults(llm=llm)

index = VectorStoreIndex(nodes, service_context=service_context)
index.storage_context.persist(persist_dir=os.path.join(dir_path, "index_buffer"))

# ===== reading from storage =====
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir=os.path.join(dir_path, "index_buffer"))

# load index
index = load_index_from_storage(storage_context)
# ===== end of reading from storage =====

# Cohere reranker: defined here but not used by the chat engine below
api_key = "your api key"
cohere_rerank = CohereRerank(api_key=api_key, top_n=5)
# Drop retrieved nodes whose similarity score is below 0.7
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)

# Placeholder filter; the route handler below builds the real one per request
filters = MetadataFilters(filters=[ExactMatchFilter(key="type", value="fruit")])

def send_to_chatgpt(message_list, i=0, temp=0, stream=False, max_tokens=256):
    if i > 2:
        raise Exception("Maximum retry limit exceeded.")
    try:
        # Log the message_list content
        logging.info(f"Sending message_list to ChatGPT: {message_list}")
        response = client.chat.completions.create(
            model="gpt-4",
            messages=message_list,
            temperature=temp,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stream=stream,
            max_tokens=max_tokens,
        )
        if stream:
            return response
        return response.choices[0].message.content.strip()
    except openai.APIError as e:
        # If the error message indicates a token limit issue, raise a custom exception
        if "too many tokens" in str(e):
            logging.error("Error: The input text exceeds the maximum token limit.")
            raise Exception("The input text exceeds the maximum token limit.")
        else:
            # Log other errors and retry
            logging.error(f"An error occurred: {e}")
            logging.info("Retrying...")
            sleep(1)
            return send_to_chatgpt(message_list, i + 1, temp, stream, max_tokens)

system_categorizer_prompt = """
As a categorizer, your primary function is to classify inputs into one of four predetermined categories based solely on the content of the input. Your role is akin to an API, which means you should not engage in or initiate any conversation beyond categorization. Upon receiving a phrase or sentence, your task is to determine which of the following categories it best fits into:
The query could be in Japanese.

1. Film/Movie
2. Book/Textbook
3. App Feature/Functionality
4. Normal Talk (not related to English learning)

If an input pertains to more than one category, please list each applicable category number, separated by commas.

For instance:
===
Input: 'Can you suggest some engaging books to help enhance my English listening skills?'
Your Response: 2
===
Input: 'How many films are in our collection?'
Your Response: 1
===
Input: 'Are there any recommended movies or additional app features that could assist me?'
Your Response: 1, 3
===
Input: 'How to ride a bike?'
Your Response: 4

Remember, your responses should only include the relevant category number(s).
"""

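def cateorize(query_content):
    # Ask ChatGPT for the category number(s) of the query
    message_list = [
        {"role": "user", "content": system_categorizer_prompt},
        {"role": "user", "content": query_content},
    ]
    response = send_to_chatgpt(message_list, i=0, temp=0, stream=False, max_tokens=256)
    # Replies such as "1, 3" have a space after the comma, so strip each part;
    # without this, " 3" would be treated as unknown and turned into "4"
    response_list = [item.strip() for item in str(response).split(",")]
    type_dict = {1: "film", 2: "book", 3: "abceed", 4: "chat"}
    # Anything that is not a known dataset category falls back to small talk
    for i in range(len(response_list)):
        if response_list[i] not in {"1", "2", "3"}:
            response_list[i] = "4"
    return [type_dict[int(response)] for response in response_list]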

# Revised Redis functions:
def get_conversation_history(user_id):
    # Fetch all elements from the Redis list
    conversation = [json.loads(item) for item in redis_client.lrange(user_id, 0, -1)]
    messages = [
        ChatMessage(role=message["role"], content=message["content"])
        for message in conversation
    ]
    return messages

def update_conversation_history(user_id, message):
    # Push the new message to the end of the Redis list
    redis_client.rpush(user_id, message.json())

@assistant_api.route("/ai_response_llama", methods=["POST"])
def ai_response_llama():
    data = request.json
    query_content = data["message"]
    type = cateorize(query_content)
    chat_history = get_conversation_history(data["user_id"])
    print("type======================================",type)
    if ["chat"] == type:
        chat_engine = OpenAIAgent.from_tools(
            tools=[],
            llm=OpenAI(temperature=0, model="gpt-4"),
            chat_history=chat_history,
        )
    else:
        # get types that are not in filters
        filters = return_filter(type)
        chat_engine = CondensePlusContextChatEngine.from_defaults(
            index.as_retriever(
                similarity_top_k=4, filters=filters
            ),
            service_context=service_context,
            system_prompt="As an efficient assistant for Globee, a company specializing in English education, your role is to leverage the context as a primary information source to address user inquiries.\n"
            "Your responses should be in succinct Japanese, reflecting the company's focus. Ensure that your answers are both concise and user-friendly. \n"
            "Your goal is to offer supportive and informative assistance, tailored to the user's needs based on the contents of the uploaded document.\n"
            "When multiple contexts are given, it is not true that all contexts are relevant to the user's inquiry. You should only use the relevant contexts to answer the user's question.\n"
            "Focus on the Query ONLY.\n"
            "Please follow the rules and make the answer short (max 3-4 sentences). It is very important to my career.\n",
            chat_history=chat_history,
            node_postprocessors=[postprocessor],
            verbose=True,
        )

    print(chat_history)
    response = chat_engine.chat(query_content)
    # update chat history by appending to the end
    update_conversation_history(
        data["user_id"], ChatMessage(role="user", content=query_content)
    )
    update_conversation_history(
        data["user_id"], ChatMessage(role="assistant", content=response.response)
    )
    return jsonify({"result": response.response})

The library is evolving very quickly, so I look forward to future updates.
