Vertex AI Vector Search performance test, vol. 1

Last updated at 2024-12-15 / Posted at 2024-12-14

I don't know whether there will be a vol. 2.

Test method

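Under the conditions below, I ran nearest neighbor searches while increasing the number of queries sent in a single request, and measured response time and throughput.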

Embedding dimensions: 128
Embeddings in the index: about 4,000,000
Neighbors retrieved per query: 30
Nodes hosting the index: 1

Note: VPC Network Peering was not configured.

Results

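Response time and throughput for each query count are shown below.
When the query count was raised to 100,000, the request began failing with the error None Stream removed.
Performance can be improved by configuring VPC Network Peering or by increasing the number of nodes hosting the index; neither was tried in this test.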

Query count	Response time [sec]	Throughput [queries/sec]
1	0.16	6
10	0.17	58
100	0.23	434
1,000	0.67	1,499
10,000	5.04	1,982
100,000	-	-
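
To make the trend in the table easier to read, the following sketch derives the amortized time per query from the measured response times. The figures are copied from the table above; the variable names are my own, and the results are only as precise as the two-decimal response times.

```python
# Measured (query count, response time in seconds) pairs copied from the table above.
measurements = [
    (1, 0.16),
    (10, 0.17),
    (100, 0.23),
    (1_000, 0.67),
    (10_000, 5.04),
]

# Amortized time per query in milliseconds: response time divided by query count.
per_query_ms = {n: seconds / n * 1000 for n, seconds in measurements}

for n, ms in per_query_ms.items():
    print(f"{n:>6,} queries: {ms:8.3f} ms per query")
```

The per-query cost falls from about 160 ms for a single query to about 0.5 ms at 10,000 queries per request, which suggests that a fixed per-request overhead dominates small requests.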

Code used

I ran the following code, changing NUM_TOTAL_QUERIES for each run.

import time
import numpy as np
from tqdm import tqdm
from google.cloud import aiplatform_v1


API_ENDPOINT = "xxxxxx.xxxxxx.vdb.vertexai.goog"
INDEX_ENDPOINT = "projects/xxxxxx/locations/xxxxxx/indexEndpoints/xxxxxx"
DEPLOYED_INDEX_ID = "xxxxxx"
QUERY_VECTOR_DIM = 128
NUM_NEIGHBORS_TO_SEARCH = 30
NUM_TOTAL_QUERIES = 1
# NUM_TOTAL_QUERIES = 10
# NUM_TOTAL_QUERIES = 100
# NUM_TOTAL_QUERIES = 1000
# NUM_TOTAL_QUERIES = 10000
# NUM_TOTAL_QUERIES = 100000


# Configure Vector Search client
client_options = {
    "api_endpoint": API_ENDPOINT
}
vector_search_client = aiplatform_v1.MatchServiceClient(
    client_options=client_options,
)

# Prepare request
queries = []
for _ in tqdm(range(NUM_TOTAL_QUERIES)):
    datapoint = aiplatform_v1.IndexDatapoint(
        feature_vector=np.random.rand(QUERY_VECTOR_DIM).astype(np.float32),
    )
    query = aiplatform_v1.FindNeighborsRequest.Query(
        datapoint=datapoint,
        neighbor_count=NUM_NEIGHBORS_TO_SEARCH,
    )
    queries.append(query)
request = aiplatform_v1.FindNeighborsRequest(
    index_endpoint=INDEX_ENDPOINT,
    deployed_index_id=DEPLOYED_INDEX_ID,
    queries=queries,
)

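# Execute nearest neighbor search: time one request that carries all queries.
# perf_counter() is a monotonic clock intended for measuring elapsed time.
start_time = time.perf_counter()
_ = vector_search_client.find_neighbors(
    request
)
end_time = time.perf_counter()
consumed_time = end_time - start_time
throughput = NUM_TOTAL_QUERIES / consumed_time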

print("performance stats")
print("="*20)
print("consumed time [sec]")
print(f"{consumed_time:.2f}")
print("-"*20)
print("throughput [query/sec]")
print(f"{throughput:,.0f}")
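
Since the single 100,000-query request failed, one option that this article did not test is to split the queries into smaller requests and time the whole run. The sketch below shows only the splitting and timing logic. The names `chunked`, `run_in_batches`, and `send_batch`, as well as the batch size of 10,000, are hypothetical and chosen for illustration; `send_batch` stands in for a call such as `vector_search_client.find_neighbors` on one batch.

```python
import time
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def run_in_batches(
    items: Sequence[T],
    batch_size: int,
    send_batch: Callable[[List[T]], object],
) -> float:
    """Send `items` batch by batch and return the total elapsed seconds."""
    start_time = time.perf_counter()
    for batch in chunked(items, batch_size):
        send_batch(batch)  # hypothetical: one find_neighbors request per batch
    return time.perf_counter() - start_time


if __name__ == "__main__":
    # Stand-in for the real request: record batch sizes instead of calling the API.
    sent_sizes = []
    elapsed = run_in_batches(
        list(range(25_000)), 10_000, lambda batch: sent_sizes.append(len(batch))
    )
    print(sent_sizes)
    print(f"{elapsed:.4f} sec")
```

Whether batching avoids the error, and how the total time compares with the figures in the table, would need to be measured against the actual endpoint.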
