Overview
This post walks through building a search engine with Vertex AI Agent Builder media search.
The content is based on the Get started with media search guide.
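Summary
- Vertex AI Agent Builder makes it easy to create a search engine
- GCS and BigQuery are supported as data sources
- A BigQuery view is an easy way to convert existing data into the required shape
- Recommendation can use the same data sources, so search and recommendations can both be implemented quickly (Recommendation is covered in a separate post)
- Client libraries are available, so an application can call the API as easily as any other Google service
- Loading data from BigQuery supports periodic import (Import from BigQuery - Periodic ingestion)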
Prepare BQ Dataset
Create the BigQuery dataset movielens:
bq mk $PROJECT_ID:movielens
Load the movies table from CSV:
bq load --skip_leading_rows=1 movielens.movies \
gs://cloud-samples-data/gen-app-builder/media-recommendations/movies.csv \
movieId:integer,title,genres
Load the ratings table from CSV:
bq load --skip_leading_rows=1 movielens.ratings \
gs://cloud-samples-data/gen-app-builder/media-recommendations/ratings.csv \
userId:integer,movieId:integer,rating:float,time:timestamp
Create the view movies_view:
bq mk --project_id=$PROJECT_ID \
--use_legacy_sql=false \
--view "
WITH t AS (
SELECT
CAST(movieId AS string) AS id,
SUBSTR(title, 0, 128) AS title,
SPLIT(genres, \"|\") AS categories
FROM \`$PROJECT_ID.movielens.movies\`)
SELECT
id, \"default_schema\" as schemaId, null as parentDocumentId,
TO_JSON_STRING(STRUCT(title as title, categories as categories,
CONCAT(\"http://mytestdomain.movie/content/\", id) as uri,
\"2023-01-01T00:00:00Z\" as available_time,
\"2033-01-01T00:00:00Z\" as expire_time,
\"movie\" as media_type)) as jsonData
FROM t;" \
$PROJECT_ID:movielens.movies_view
Create the view user_events_for_search (the name user_events is already used in https://qiita.com/nakamasato/items/012ea7159d3e3fc8e30e):
bq mk --project_id=$PROJECT_ID \
--use_legacy_sql=false \
--view "
WITH t AS (
SELECT
MIN(UNIX_SECONDS(time)) AS old_start,
MAX(UNIX_SECONDS(time)) AS old_end,
UNIX_SECONDS(TIMESTAMP_SUB(
CURRENT_TIMESTAMP(), INTERVAL 90 DAY)) AS new_start,
UNIX_SECONDS(CURRENT_TIMESTAMP()) AS new_end
FROM \`$PROJECT_ID.movielens.ratings\`)
SELECT
CAST(userId AS STRING) AS userPseudoId,
\"view-item\" AS eventType,
FORMAT_TIMESTAMP(\"%Y-%m-%dT%X%Ez\",
TIMESTAMP_SECONDS(CAST(
(t.new_start + (UNIX_SECONDS(time) - t.old_start) *
(t.new_end - t.new_start) / (t.old_end - t.old_start))
AS int64))) AS eventTime,
[STRUCT(movieId AS id, null AS name)] AS documents,
FROM \`$PROJECT_ID.movielens.ratings\`, t
WHERE rating >= 4;" \
$PROJECT_ID:movielens.user_events_for_search
In this view the document name is always null, and the document id holds the movie ID.
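The eventTime expression in the view linearly rescales the original rating timestamps into the last 90 days. A minimal Go sketch of that arithmetic follows; rescale and formatEventTime are hypothetical helper names, and the guard for an empty source range is an assumption added here that the view does not have.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// rescale linearly maps t from [oldStart, oldEnd] onto [newStart, newEnd].
// All values are Unix seconds. The result is rounded to the nearest second.
func rescale(t, oldStart, oldEnd, newStart, newEnd int64) int64 {
	if oldEnd == oldStart {
		// Assumption for this sketch: with a single source timestamp,
		// map everything to newStart instead of dividing by zero.
		return newStart
	}
	offset := float64((t-oldStart)*(newEnd-newStart)) / float64(oldEnd-oldStart)
	return int64(math.Round(float64(newStart) + offset))
}

// formatEventTime renders Unix seconds in UTC with a numeric offset,
// the same shape the view's FORMAT_TIMESTAMP pattern produces.
func formatEventTime(sec int64) string {
	return time.Unix(sec, 0).UTC().Format("2006-01-02T15:04:05-07:00")
}

func main() {
	// Toy ranges: source spans 0..1000 s, target spans 5000..5900 s.
	fmt.Println(rescale(0, 0, 1000, 5000, 5900))    // 5000
	fmt.Println(rescale(500, 0, 1000, 5000, 5900))  // 5450
	fmt.Println(rescale(1000, 0, 1000, 5000, 5900)) // 5900
	fmt.Println(formatEventTime(0))                 // 1970-01-01T00:00:00+00:00
}
```

The oldest rating lands exactly 90 days ago and the newest lands at the current time, so the relative ordering and spacing of events is preserved while all of them fall inside a recent window.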
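Create the Search app
Open https://console.cloud.google.com/gen-app-builder/engines and select Search (select Recommendation instead if you want to build recommendations).
Select Media and give the app a name.
Create the data store
Import data
Import documents
You can choose either BigQuery or Cloud Storage; this time, choose BigQuery.
For the dataset, select the movies_view view in the movielens dataset created earlier and start the import.
The import takes about 15 minutes, so wait for it to complete.
86,537 documents are imported.
Import user events
Next, open the Events tab and click Import Events.
As before, select BigQuery, specify user_events_for_search (the view created above), and run the import.
Wait for the import to complete.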
Search Configuration
Try searching for The Lord of the Rings.
Select the search conditions.
When you are done, click [Save and Publish].
Configure the search widget
On the widget settings screen, you can choose between JWT or OAuth based authorization and public access.
You can also configure the domain settings.
Call the Search API
Clicking Run in Cloud Shell executes the request and returns actual search results.
The request that was used:
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://discoveryengine.googleapis.com/v1alpha/projects/<project_number>/locations/global/collections/default_collection/engines/quickstart-media-search/servingConfigs/default_search:search" \
-d '{"query":"load of the ring","pageSize":10,"queryExpansionSpec":{"condition":"AUTO"},"spellCorrectionSpec":{"mode":"AUTO"}}'
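When calling the same endpoint from code rather than curl, the request body can be built from typed structs instead of a hand-written JSON string. The sketch below only reproduces the four fields used in the curl command above; the type and function names (searchBody, buildSearchBody, and so on) are hypothetical, and this is not the client library's request type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryExpansionSpec and spellCorrectionSpec mirror the nested objects
// in the curl request body. (Hypothetical types for this sketch.)
type queryExpansionSpec struct {
	Condition string `json:"condition"`
}

type spellCorrectionSpec struct {
	Mode string `json:"mode"`
}

// searchBody covers only the fields used in the curl example.
type searchBody struct {
	Query               string              `json:"query"`
	PageSize            int                 `json:"pageSize"`
	QueryExpansionSpec  queryExpansionSpec  `json:"queryExpansionSpec"`
	SpellCorrectionSpec spellCorrectionSpec `json:"spellCorrectionSpec"`
}

// buildSearchBody returns the JSON body for a search request with
// automatic query expansion and spell correction enabled.
func buildSearchBody(query string, pageSize int) (string, error) {
	b, err := json.Marshal(searchBody{
		Query:               query,
		PageSize:            pageSize,
		QueryExpansionSpec:  queryExpansionSpec{Condition: "AUTO"},
		SpellCorrectionSpec: spellCorrectionSpec{Mode: "AUTO"},
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := buildSearchBody("load of the ring", 10)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(body)
}
```

The printed string is the same payload passed to curl with -d, so it can be sent with any HTTP client together with the Authorization and Content-Type headers shown above.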
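Integrate the Search API into an application
This can be done by implementing a client that authenticates with Application Default Credentials (ADC), following the referenced sample.
Example: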
package main

import (
	"context"
	"fmt"
	"os"

	discoveryengine "cloud.google.com/go/discoveryengine/apiv1beta"
	discoveryenginepb "cloud.google.com/go/discoveryengine/apiv1beta/discoveryenginepb"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

// search runs a query against a data store's default serving config, given the
// Google Cloud project ID, location, and data store ID.
func search(projectID, location, dataStoreID, query string) error {
	ctx := context.Background()

	// Create a client. Non-global locations use a location-prefixed endpoint.
	endpoint := "discoveryengine.googleapis.com:443" // Default to global endpoint
	if location != "global" {
		endpoint = fmt.Sprintf("%s-%s", location, endpoint)
	}
	client, err := discoveryengine.NewSearchClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		return fmt.Errorf("creating Vertex AI Search client: %w", err)
	}
	defer client.Close()

	// Full resource name of the data store's serving config
	servingConfig := fmt.Sprintf("projects/%s/locations/%s/collections/default_collection/dataStores/%s/servingConfigs/default_serving_config",
		projectID, location, dataStoreID)

	searchRequest := &discoveryenginepb.SearchRequest{
		ServingConfig: servingConfig,
		Query:         query,
	}

	// Iterate over the results until the iterator is exhausted.
	it := client.Search(ctx, searchRequest)
	for {
		resp, err := it.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return err
		}
		fmt.Printf("%+v\n", resp)
	}
	return nil
}

func main() {
	query := "Terminator"
	if err := search(os.Getenv("GCP_PROJECT_ID"), os.Getenv("LOCATION"), os.Getenv("DATA_STORE_ID"), query); err != nil {
		fmt.Println(err)
	}
}
export GCP_PROJECT_ID=xxx
export LOCATION=global
export DATA_STORE_ID=quickstart-media-data-store
go run main.go
id:"175137" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Documentary"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"The Making of 'The Terminator': A Retrospective (1992)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/175137"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/175137" id:"175137"}
id:"207830" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Sci-Fi"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Terminator: Dark Fate (2019)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/207830"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/207830" id:"207830"}
id:"6537" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Adventure"} values:{string_value:"Sci-Fi"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Terminator 3: Rise of the Machines (2003)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/6537"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/6537" id:"6537"}
id:"120799" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Adventure"} values:{string_value:"Sci-Fi"} values:{string_value:"Thriller"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Terminator Genisys (2015)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/120799"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/120799" id:"120799"}
id:"212887" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"(no genres listed)"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Other Voices: Creating 'The Terminator' (2001)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/212887"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/212887" id:"212887"}
id:"589" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Sci-Fi"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Terminator 2: Judgment Day (1991)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/589"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/589" id:"589"}
id:"177415" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Documentary"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"The Making of 'Terminator 2: Judgment Day' (1991)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/177415"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/177415" id:"177415"}
id:"68791" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Adventure"} values:{string_value:"Sci-Fi"} values:{string_value:"Thriller"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Terminator Salvation (2009)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/68791"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/68791" id:"68791"}
id:"136200" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Horror"} values:{string_value:"Sci-Fi"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"The Terminators (2009)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/136200"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/136200" id:"136200"}
id:"228409" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Ninja Terminator (1985)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/228409"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/228409" id:"228409"}
id:"102425" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Adventure"} values:{string_value:"Horror"} values:{string_value:"Sci-Fi"} values:{string_value:"Thriller"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Lady Terminator (Pembalasan ratu pantai selatan) (1989)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/102425"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/102425" id:"102425"}
id:"139909" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Russian Terminator (1989)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/139909"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/139909" id:"139909"}
id:"214344" document:{struct_data:{fields:{key:"available_time" value:{string_value:"2023-01-01T00:00:00Z"}} fields:{key:"categories" value:{list_value:{values:{string_value:"Action"} values:{string_value:"Drama"}}}} fields:{key:"expire_time" value:{string_value:"2033-01-01T00:00:00Z"}} fields:{key:"media_type" value:{string_value:"movie"}} fields:{key:"title" value:{string_value:"Angel Terminators 2 (1993)"}} fields:{key:"uri" value:{string_value:"http://mytestdomain.movie/content/214344"}}} name:"projects/267676219654/locations/global/collections/default_collection/dataStores/quickstart-media-data-store/branches/0/documents/214344" id:"214344"}
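Note that the curl request and the Go client above address different resources: curl goes through the engine (engines/.../servingConfigs/default_search), while the Go example goes through the data store (dataStores/.../servingConfigs/default_serving_config). The following sketch collects both name patterns, plus the endpoint rule, into small helpers. The function names are hypothetical; the path formats are copied from the two examples above and nothing else is assumed about the API.

```go
package main

import "fmt"

// endpointFor returns the API endpoint for a location, using the same
// rule as the Go example: global has no prefix, others are prefixed.
func endpointFor(location string) string {
	const global = "discoveryengine.googleapis.com:443"
	if location == "global" {
		return global
	}
	return location + "-" + global
}

// dataStoreServingConfig builds the serving config name used by the Go client example.
func dataStoreServingConfig(project, location, dataStoreID string) string {
	return fmt.Sprintf("projects/%s/locations/%s/collections/default_collection/dataStores/%s/servingConfigs/default_serving_config",
		project, location, dataStoreID)
}

// engineServingConfig builds the serving config name used by the curl example.
func engineServingConfig(project, location, engineID string) string {
	return fmt.Sprintf("projects/%s/locations/%s/collections/default_collection/engines/%s/servingConfigs/default_search",
		project, location, engineID)
}

func main() {
	// "my-project" is a placeholder project identifier.
	fmt.Println(endpointFor("global"))
	fmt.Println(dataStoreServingConfig("my-project", "global", "quickstart-media-data-store"))
	fmt.Println(engineServingConfig("my-project", "global", "quickstart-media-search"))
}
```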
Importing data
Both Import from BigQuery and Import from Cloud Storage support Periodic ingestion (Public Preview), so the data can also be refreshed on a schedule (once every 1, 3, or 5 days).
Periodic ingestion: You import data from one or more BigQuery tables, and you set a sync frequency that determines how often the data stores are updated with the most recent data from the BigQuery dataset.
Data updates automatically every 1, 3, or 5 days. Data cannot be manually refreshed.
For details, see Create a search data store.