MovieLens recommendation with Vertex AI Search for Retail

Posted at 2024-08-27

About this page

I worked through the tutorial below and tried Vertex AI Search for Retail with the MovieLens data.

Environment

Cloud Shell is very convenient.

Preparing the data

gcloud storage buckets create gs://$PROJECT_ID-movielens-data
gcloud storage cp ml-latest/movies.csv ml-latest/ratings.csv   gs://$PROJECT_ID-movielens-data
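
The commands above assume that $PROJECT_ID is set and that the dataset is already unpacked into ml-latest/. Here is a minimal sketch of that preparation. The download URL is where I believe GroupLens publishes the full dataset, and the helper names download_movielens and check_movielens_files are made up for illustration.

```shell
# Assumption: GroupLens publishes the full MovieLens dataset at this URL.
MOVIELENS_URL="https://files.grouplens.org/datasets/movielens/ml-latest.zip"

# Download and unpack the dataset into ./ml-latest (needs curl and unzip).
download_movielens() (
  curl -fsSLO "$MOVIELENS_URL" && unzip -o -q ml-latest.zip
)

# Succeed only if both CSV files used in this article exist in the directory.
check_movielens_files() (
  dir=${1:-ml-latest}
  for f in movies.csv ratings.csv; do
    [ -f "$dir/$f" ] || { echo "missing: $dir/$f" >&2; exit 1; }
  done
)

# Usage (nothing runs automatically):
#   export PROJECT_ID=$(gcloud config get-value project)
#   download_movielens && check_movielens_files ml-latest
```

Running the check before the upload catches a wrong working directory early.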

Preparing BigQuery

Create the movielens dataset

bq mk $PROJECT_ID:movielens

Load the CSV file into the movies table

bq load --skip_leading_rows=1 $PROJECT_ID:movielens.movies \
  gs://$PROJECT_ID-movielens-data/movies.csv \
  movieId:integer,title,genres

Load the CSV file into the ratings table

bq load --skip_leading_rows=1 $PROJECT_ID:movielens.ratings \
  gs://$PROJECT_ID-movielens-data/ratings.csv \
  userId:integer,movieId:integer,rating:float,time:timestamp

Create a view that maps the movies table to the product catalog schema

bq mk --project_id=$PROJECT_ID \
 --use_legacy_sql=false \
 --view "
 SELECT
   CAST(movieId AS string) AS id,
   SUBSTR(title, 0, 128) AS title,
   SPLIT(genres, \"|\") AS categories
 FROM \`$PROJECT_ID.movielens.movies\`" \
$PROJECT_ID:movielens.products
  • id: movie_id
  • title
  • categories
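
To confirm that the view has the expected shape (id, title, categories), a quick preview query can help. This is a minimal sketch: preview_sql and preview_view are helper names made up here, and preview_view assumes bq is installed and $PROJECT_ID is set.

```shell
# Build a query that previews the first rows of a table or view in the movielens dataset.
# args: project_id table_or_view [limit, default 5]
preview_sql() (
  printf 'SELECT * FROM `%s.movielens.%s` LIMIT %d' "$1" "$2" "${3:-5}"
)

# Run the preview with bq (standard SQL). args: table_or_view [limit]
preview_view() (
  bq query --use_legacy_sql=false "$(preview_sql "$PROJECT_ID" "$1" "${2:-5}")"
)

# Usage (nothing runs automatically):
#   preview_view products
```

The same helper works for the user_events view defined next.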

Create a view that maps ratings to the user_events schema

  • Rescale the MovieLens timeline into the last 90 days. We do this for two reasons:
    • Vertex AI Search for Retail requires that user events are no older than 2015. MovieLens ratings go back to 1995.
    • Vertex AI Search for Retail uses the last 90 days of user events when serving prediction requests for a user. After rescaling, every user appears to have recent events, so predictions can be made for any user later on.

bq mk --project_id=$PROJECT_ID \
 --use_legacy_sql=false \
 --view "
 WITH t AS (
   SELECT
     MIN(UNIX_SECONDS(time)) AS old_start,
     MAX(UNIX_SECONDS(time)) AS old_end,
     UNIX_SECONDS(TIMESTAMP_SUB(
       CURRENT_TIMESTAMP(), INTERVAL 90 DAY)) AS new_start,
     UNIX_SECONDS(CURRENT_TIMESTAMP()) AS new_end
   FROM \`$PROJECT_ID.movielens.ratings\`)
 SELECT
   CAST(userId AS STRING) AS visitorId,
   \"detail-page-view\" AS eventType,
   FORMAT_TIMESTAMP(
     \"%Y-%m-%dT%X%Ez\",
     TIMESTAMP_SECONDS(CAST(
       (t.new_start + (UNIX_SECONDS(time) - t.old_start) *
         (t.new_end - t.new_start) / (t.old_end - t.old_start))
     AS int64))) AS eventTime,
   [STRUCT(STRUCT(CAST(movieId AS STRING) AS id) AS product)] AS productDetails,
 FROM \`$PROJECT_ID.movielens.ratings\`, t
 WHERE rating >= 4" \
$PROJECT_ID:movielens.user_events

Ratings of 4 or higher are treated as detail-page-view events for the movie.

  • visitorId: userId
  • eventType: detail-page-view
  • eventTime
  • productDetails: {id: movie_id}
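
The timeline rescaling in the view is a plain linear mapping from [old_start, old_end] onto [new_start, new_end]. Here is a small sketch of the same arithmetic in shell, with toy numbers. The function name rescale_ts is made up, and the rounding to the nearest second is my assumption about how the CAST to int64 in the view behaves.

```shell
# Map timestamp t (Unix seconds) from [old_start, old_end] onto [new_start, new_end].
# Integer arithmetic, rounded to the nearest second.
# Assumes old_end > old_start and t >= old_start.
rescale_ts() (
  t=$1; old_start=$2; old_end=$3; new_start=$4; new_end=$5
  num=$(( (t - old_start) * (new_end - new_start) ))
  den=$(( old_end - old_start ))
  echo $(( new_start + (num + den / 2) / den ))
)

# Toy example: old timeline 0..100, new window 1000..1090.
rescale_ts 0 0 100 1000 1090     # prints 1000
rescale_ts 50 0 100 1000 1090    # prints 1045
rescale_ts 100 0 100 1000 1090   # prints 1090
```

The oldest rating lands at the start of the 90-day window, the newest at the current time, and everything in between keeps its relative position.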

Importing the data

Import the product catalog (5-10 min)

gcloud services enable retail.googleapis.com --project $PROJECT_ID

There does not appear to be a gcloud command for this, so I imported from the console: https://console.cloud.google.com/ai/retail/catalogs/default_catalog/data/catalog

Note: unless you specify a bucket under Advanced Options, a new GCS bucket is created.

Screenshot 2024-08-24 at 20.34.41.png

When the import succeeded, the following command was shown, so it looks like the Retail catalog data import can also be run on a schedule.

gcloud scheduler --project xxxx \
jobs create http import_catalog_xxxx \
--time-zone='America/Los_Angeles' \
--schedule='0 0 * * *' \
--uri='https://retail.googleapis.com/v2alpha/projects/xxxxx/locations/global/catalogs/default_catalog/branches/0/products:import' \
--description='Import Retail catalog data' \
--headers='Content-Type: application/json; charset=utf-8' \
--http-method='POST' \
--message-body='{"inputConfig":{"bigQuerySource":{"projectId":"xxxx","datasetId":"movielens","tableId":"products","dataSchema":"product"}},"reconciliationMode":"INCREMENTAL"}' \
--oauth-service-account-email=''
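
Since the scheduler job only POSTs to the products:import endpoint, the same request can presumably be sent once with curl. A minimal sketch follows: the URL and body are copied from the command above, the helper names are made up, and the curl call assumes gcloud is authenticated for the project (I have not checked whether extra headers such as a quota project are needed in every setup).

```shell
# Endpoint taken from the scheduler command above. args: project_id
import_products_url() (
  printf 'https://retail.googleapis.com/v2alpha/projects/%s/locations/global/catalogs/default_catalog/branches/0/products:import' "$1"
)

# Request body taken from the scheduler command above. args: project_id
import_products_body() (
  printf '{"inputConfig":{"bigQuerySource":{"projectId":"%s","datasetId":"movielens","tableId":"products","dataSchema":"product"}},"reconciliationMode":"INCREMENTAL"}' "$1"
)

# Send the import request once. args: project_id
import_products() (
  curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "$(import_products_body "$1")" \
    "$(import_products_url "$1")"
)

# Usage (nothing runs automatically):
#   import_products "$PROJECT_ID"
```

The response should describe a long-running operation, which matches the operation status view in the console.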

You can check the status of the import operation

Screenshot 2024-08-24 at 20.46.03.png

Once the import completes, the catalog can be browsed as follows

Screenshot 2024-08-24 at 20.40.56.png

Import user events (1 h)

Import the user events in the same way

  • Note: unless you specify a bucket under Advanced Options, a new GCS bucket is created.
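
I did this import from the console, but the Retail API also has a userEvents:import method. Here is a hedged sketch of the equivalent request. The v2alpha version and the "user_event" schema name are my assumptions based on the API reference, not something the console showed me, and the helper names are made up.

```shell
# Assumed endpoint for importing user events. args: project_id
import_user_events_url() (
  printf 'https://retail.googleapis.com/v2alpha/projects/%s/locations/global/catalogs/default_catalog/userEvents:import' "$1"
)

# Request body pointing at the movielens.user_events view. args: project_id
import_user_events_body() (
  printf '{"inputConfig":{"bigQuerySource":{"projectId":"%s","datasetId":"movielens","tableId":"user_events","dataSchema":"user_event"}}}' "$1"
)

# Send the import request once (assumes gcloud is authenticated). args: project_id
import_user_events() (
  curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "$(import_user_events_body "$1")" \
    "$(import_user_events_url "$1")"
)

# Usage (nothing runs automatically):
#   import_user_events "$PROJECT_ID"
```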

Screenshot 2024-08-24 at 20.42.22.png

You can check the import status

Screenshot 2024-08-24 at 21.10.35.png

Checking the user event data

Screenshot 2024-08-24 at 21.11.51.png

Training the recommendation model (2 days)

You can train a model by choosing a model type, a business objective, and so on.

Model types:

  • Recommended for you: suited to a home page feed
  • Others you may like: suited to related items on a detail page
  • Frequently bought together: shown right after add-to-cart or on a detail page
  • Similar items: detail page
  • Buy it again: recommends items likely to be purchased again, based on purchase history. Used on detail pages, add-to-cart, the shopping cart, category views, the home page, and so on
  • Page-level optimization: Automatically optimizes the entire page and catalog item recommendations with multiple recommendation panels (I do not fully understand what kind of optimization this performs)
  • On sale: recommends products that are on sale

Screenshot 2024-08-24 at 21.18.29.png

Model list

Screenshot 2024-08-24 at 21.20.10.png

Training takes about two days, so wait. I do not know why it takes this long.
Screenshot 2024-08-27 at 8.11.26.png

Screenshot 2024-08-27 at 8.09.57.png

Serving config setup

Create the serving config from the Create Serving Config page

  1. Select the trained model
  2. Choose options such as Price reranking and Result diversification
    Screenshot 2024-08-27 at 8.14.15.png
  3. For Serving Control, I did not select anything this time

Serving config creation complete ✅

Screenshot 2024-08-27 at 8.15.58.png

Confirm that Model ready to query shows yes

Screenshot 2024-08-27 at 8.16.48.png

Predict (Evaluate)

From the Evaluate tab:

  • Visitor ID is optional
  • For Product ID, enter a movie ID in this case: 4993 ("The Lord of the Rings: The Fellowship of the Ring (2001)")

Screenshot 2024-08-27 at 8.19.45.png

The prediction result appears on the right :tada:

Screenshot 2024-08-27 at 8.21.12.png
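
The same prediction can presumably be requested outside the console through the servingConfigs predict method of the Retail API. Here is a minimal sketch under stated assumptions: the helper names are made up, the serving config ID is whatever you named it in the previous step, and unlike the Evaluate tab the API request carries a visitorId (I believe it is required there).

```shell
# Predict endpoint for a serving config. args: project_id serving_config_id
predict_url() (
  printf 'https://retail.googleapis.com/v2/projects/%s/locations/global/catalogs/default_catalog/servingConfigs/%s:predict' "$1" "$2"
)

# Request body: a detail-page-view event for one product. args: visitor_id product_id [page_size, default 5]
predict_body() (
  printf '{"userEvent":{"eventType":"detail-page-view","visitorId":"%s","productDetails":[{"product":{"id":"%s"}}]},"pageSize":%d}' "$1" "$2" "${3:-5}"
)

# Send the request (assumes gcloud is authenticated).
# args: project_id serving_config_id visitor_id product_id
predict() (
  curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    --data "$(predict_body "$3" "$4")" \
    "$(predict_url "$1" "$2")"
)

# Usage (nothing runs automatically; my-serving-config and visitor-1 are placeholders):
#   predict "$PROJECT_ID" my-serving-config visitor-1 4993
```

The event type matches the detail-page-view events imported earlier, and 4993 is the movie ID used in the Evaluate tab above.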
