
Trying out Momento Vector Index


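The content is exactly what the title says. It follows the official documentation almost step by step, for which I apologize, but I am keeping it here as a memo to myself.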

Steps

  • Get an API key from the management console

    • Region: us-west-2
    • Key Type: Super User
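The code later in this post reads the key from an environment variable called MOMENTO_API_KEY (loaded from a .env file with python-dotenv). As a rough sketch, a small fail-fast lookup like the one below can make a missing key easier to spot. `require_env` is a hypothetical helper made up for this example, not part of Momento or python-dotenv.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    # Fall back to the real process environment when no mapping is passed in.
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set or is empty")
    return value


if __name__ == "__main__":
    # Demo with an explicit mapping so the example runs without a real key.
    print(require_env("MOMENTO_API_KEY", {"MOMENTO_API_KEY": "dummy-key"}))
```

In the actual script this would be called after `load_dotenv()`, for example `require_env("MOMENTO_API_KEY")`.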


From here on, everything is Python.

Installing the modules

pip install momento
pip install openai
pip install python-dotenv

python
from momento import CredentialProvider, PreviewVectorIndexClient, VectorIndexConfigurations
from momento.requests.vector_index import ALL_METADATA, Item, SimilarityMetric
from momento.responses.vector_index import CreateIndex, DeleteIndex, ListIndexes, Search, UpsertItemBatch
import openai
import os
import pandas as pd
from dotenv import load_dotenv

Creating the index

python
load_dotenv()

# This example uses the Azure OpenAI embeddings API
API_KEY = os.environ['API_KEY']
RESOURCE_ENDPOINT = os.environ['RESOURCE_ENDPOINT']

openai.api_type = "azure"
openai.api_key = API_KEY
openai.azure_endpoint = RESOURCE_ENDPOINT
openai.api_version = "2023-05-15"

MOMENTO_API_KEY = os.environ['MOMENTO_API_KEY']

# Initialize the Momento client
client = PreviewVectorIndexClient(
    configuration=VectorIndexConfigurations.Default.latest(),
    credential_provider=CredentialProvider.from_string(MOMENTO_API_KEY),
)

# Create an index in Momento (1536 dimensions, cosine similarity)
index_name = "test_index"
num_dimensions = 1536
similarity_metric = SimilarityMetric.COSINE_SIMILARITY

create_index_response = client.create_index(
    index_name,
    num_dimensions,
    similarity_metric)

if isinstance(create_index_response, CreateIndex.Success):
    print(f"Index with name {index_name!r} successfully created!")
elif isinstance(create_index_response, CreateIndex.IndexAlreadyExists):
    print(f"Index with name {index_name!r} already exists")
elif isinstance(create_index_response, CreateIndex.Error):
    print(f"Error while creating index: {create_index_response.message}")
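For reference, here is a minimal pure-Python sketch of what cosine similarity computes for two vectors. It only illustrates the formula (dot product divided by the product of the norms) and is not Momento's implementation; `cosine_similarity` is a name chosen for this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Same direction, orthogonal, and opposite direction.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
    print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))
```

The value ranges from -1 to 1, and vectors pointing in the same direction score 1 regardless of their length.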

Embedding the Q column of the test Q&A data

python
df = pd.read_csv("./csv/test_faq_dataset.csv")
df = df.fillna("")
df.head()
df.info()

# Embedding function
def get_embedding(text, model="text-embedding-ada-002"):
    # Newlines are replaced with spaces before the text is sent to the API.
    # With Azure OpenAI, `model` is the name of your embedding deployment.
    text = text.replace("\n", " ")
    res = openai.embeddings.create(input=[text], model=model).data[0].embedding
    return res

df['vectorContent'] = df['Q'].apply(lambda x: get_embedding(x, model="text-embedding-ada-002"))
df.head(3)
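The code above calls the embeddings API once per row. If the dataset has repeated questions, a small cache could avoid paying for the same text twice. The sketch below is only an idea, not something from the original run: `make_cached_embedder` is a hypothetical helper, and the stand-in embedder in the demo replaces the real `get_embedding` so the example runs offline.

```python
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]


def make_cached_embedder(embed: Embedder) -> Embedder:
    """Wrap an embedding function so each distinct text is embedded only once.

    Hypothetical helper for illustration only.
    """
    cache: Dict[str, List[float]] = {}

    def cached(text: str) -> List[float]:
        # Normalize newlines the same way get_embedding does, so that
        # "a\nb" and "a b" share one cache entry.
        key = text.replace("\n", " ")
        if key not in cache:
            cache[key] = embed(key)
        return cache[key]

    return cached


if __name__ == "__main__":
    # Stand-in embedder (an assumption for the demo, not a real API call).
    def fake_embed(text: str) -> List[float]:
        print(f"embedding {text!r}")
        return [float(len(text))]

    embed_once = make_cached_embedder(fake_embed)
    embed_once("hello")
    embed_once("hello")  # served from the cache, fake_embed is not called again
```

With the real function it would be used as `df['Q'].apply(make_cached_embedder(get_embedding))`. Note that cached calls return the same list object, so it should not be mutated.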

The source data was downloaded from the following and then processed.

Listing the indexes

python
list_indexes_response = client.list_indexes()

if isinstance(list_indexes_response, ListIndexes.Success):
    for index in list_indexes_response.indexes:
        print(f"Index name: {index.name}, number of dimensions: {index.num_dimensions}")
elif isinstance(list_indexes_response, ListIndexes.Error):
    print(f"Error while listing indexes: {list_indexes_response.message}")

Inserting data into the index

python
def create_item(row):
    item_id = row['ID']
    metadata = {'Q': row['Q'], 'A': row['A']}
    vector = row['vectorContent']

    return Item(id=str(item_id), vector=vector, metadata=metadata)

# Convert each row of df into an Item object
items = df.apply(create_item, axis=1).tolist()

# Upload the items in batches of 100
index_name = "test_index"
for i in range(0, len(items), 100):
    items_batch = items[i:i + 100]
    upsert_response = client.upsert_item_batch(index_name=index_name, items=items_batch)

    if isinstance(upsert_response, UpsertItemBatch.Success):
        print("Successfully upserted items")
    elif isinstance(upsert_response, UpsertItemBatch.Error):
        print(f"Error while adding items to index {index_name!r}: {upsert_response.message}")
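Since the index was created with a fixed number of dimensions (1536 here), it may be worth checking vector lengths before uploading. The following is a minimal sketch under that assumption. `split_by_dimension` is a hypothetical helper, and it works on plain `(id, vector)` tuples rather than Momento `Item` objects so that it runs without the SDK.

```python
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[str, Sequence[float]]


def split_by_dimension(rows: Iterable[Row], num_dimensions: int) -> Tuple[List[Row], List[str]]:
    """Separate rows whose vector length matches the index from those that do not.

    Hypothetical helper for illustration only.
    """
    valid: List[Row] = []
    invalid_ids: List[str] = []
    for item_id, vector in rows:
        if len(vector) == num_dimensions:
            valid.append((item_id, vector))
        else:
            invalid_ids.append(item_id)
    return valid, invalid_ids


if __name__ == "__main__":
    # Tiny 3-dimensional demo; the index in this post would use 1536.
    rows = [("1", [0.1, 0.2, 0.3]), ("2", [0.1, 0.2])]
    valid, invalid_ids = split_by_dimension(rows, num_dimensions=3)
    print(len(valid), invalid_ids)
```

Only the rows in `valid` would then be turned into `Item` objects and upserted, and `invalid_ids` shows which rows need a second look.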

Search

python
index_name = "test_index"
query = "出生届けはどうすれば良いですか?"  # "How do I file a birth registration?"
query_vector = get_embedding(query, model="text-embedding-ada-002")
top_k = 2

# Run the search
search_response = client.search(index_name,
                                query_vector=query_vector,
                                top_k=top_k,
                                metadata_fields=ALL_METADATA)

if isinstance(search_response, Search.Success):
    print(f"Search succeeded with {len(search_response.hits)} matches")
    for hit in search_response.hits:
        print(f"Item ID: {hit.id}, score: {hit.score}")
        print(f"Metadata: {hit.metadata}")
elif isinstance(search_response, Search.Error):
    print(f"Error while searching on index {index_name}: {search_response.message}")

Search succeeded with 2 matches
Item ID: 1072, score: 0.9489536881446838
Metadata: {'A': '出生届は父または母が届出人となります。届書を...'}
Item ID: 363, score: 0.9417130947113037
Metadata: {'A': '出生届は出生の日から14日以内(初日算入)に届出をしてください。...'}
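Both hits above score around 0.94 to 0.95. In a real FAQ search you may want to drop weak matches instead of always returning the top_k. Here is a minimal client-side sketch of that idea. `filter_hits` and the `min_score` threshold are assumptions made for this example, and the hits are plain `(id, score)` tuples rather than the SDK's hit objects.

```python
from typing import List, Tuple

Hit = Tuple[str, float]


def filter_hits(hits: List[Hit], min_score: float) -> List[Hit]:
    """Keep only the hits whose score is at least min_score, preserving order.

    Hypothetical helper for illustration only.
    """
    return [(item_id, score) for item_id, score in hits if score >= min_score]


if __name__ == "__main__":
    # IDs and scores copied from the search result shown above.
    hits = [("1072", 0.9489536881446838), ("363", 0.9417130947113037)]
    print(filter_hits(hits, min_score=0.945))
```

With the real response, the input list could be built as `[(hit.id, hit.score) for hit in search_response.hits]`. A suitable threshold depends on the embedding model and the data, so 0.945 here is only a demo value.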

Deleting the index

python
index_name = "test_index"

delete_response = client.delete_index(index_name)
if isinstance(delete_response, DeleteIndex.Success):
    print("Successfully deleted index")
elif isinstance(delete_response, DeleteIndex.Error):
    print(f"Error while deleting index {delete_response.message}")

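In closing

Going from the first login to actually using the service was really quick, which is great!!
I'd like to take this as a starting point and try out more of Momento 🙂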
