Databricks Japan K.K.

What is Databricks Feature & Function Serving?

Databricks

Posted at 2023-12-11

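This is a translation of What is Databricks Feature & Function Serving? | Databricks on AWS [as of 2023/12/8].

This is an abridged translation and its accuracy is not guaranteed. Refer to the original text for the exact content.

Preview
This feature is in Public Preview.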

Databricks Feature & Function Serving makes data in the Databricks platform available to models and applications deployed outside of Databricks. Feature & Function Serving endpoints scale automatically with real-time traffic and provide highly available, low-latency feature serving. This page describes how to set up Feature & Function Serving.

When you use Databricks Model Serving to serve a model built with features from Databricks, the model automatically looks up and transforms features for inference requests. With Databricks Feature & Function Serving, you can serve structured data for retrieval augmented generation (RAG) applications, as well as the features required by models served outside of Databricks or by any other application that needs features based on data in Unity Catalog.

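Why use Feature & Function Serving?

Databricks Feature & Function Serving provides a single interface for serving both pre-materialized and on-demand features. It also offers the following benefits:

Simplicity. Databricks takes care of the infrastructure. A single API call creates a production-ready serving endpoint.
High availability and scalability. Feature & Function Serving endpoints automatically scale up and down with the volume of serving requests.
Security. Endpoints are deployed in a secure network boundary and use dedicated compute that is terminated when the endpoint is deleted or scaled to zero.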

Requirements

Databricks Runtime 14.2 ML or above
Feature & Function Serving requires databricks-feature-store version 0.16.2 or above. Databricks Runtime 14.2 ML includes version 0.16.1. To install the required version manually, run %pip install databricks-feature-store>=0.16.2. If you are using a Databricks notebook, you must then restart the Python kernel by running dbutils.library.restartPython() in a new cell.
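As a quick sanity check before calling the serving APIs, a notebook cell can compare the installed client version against the 0.16.2 minimum stated above. The following is a minimal sketch using only the standard library. The helper names `parse_version`, `meets_minimum`, and `installed_version` are illustrative, and the simple parser assumes plain dotted numeric versions such as `0.16.2`.

```python
from importlib import metadata

REQUIRED = "0.16.2"  # minimum version stated in the requirements above


def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '0.16.2' into (0, 16, 2)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, required: str = REQUIRED) -> bool:
    """Return True if `installed` is at least `required`."""
    return parse_version(installed) >= parse_version(required)


def installed_version(package: str = "databricks-feature-store"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

With these helpers, `meets_minimum("0.16.1")` is false for the version bundled with Databricks Runtime 14.2 ML, which is why the manual install step is needed.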

Example notebook

The following notebook shows how to create a Feature & Function Serving endpoint.

Feature & Function Serving example notebook

Create a `FeatureSpec`

A FeatureSpec is a user-defined set of features and functions. You can combine features and functions in a FeatureSpec. FeatureSpecs are managed by Unity Catalog and appear in Catalog Explorer.

The tables specified in a FeatureSpec must be published to an online store. For how to publish features to an online store, see Publish features to an online store.

Python

from databricks.feature_engineering import (
  FeatureFunction,
  FeatureLookup,
  FeatureEngineeringClient,
)

fe = FeatureEngineeringClient()

features = [
  # Lookup column `average_yearly_spend` and `country` from a table in UC by the input `user_id`.
  FeatureLookup(
    table_name="main.default.customer_profile",
    lookup_key="user_id",
    features=["average_yearly_spend", "country"]
  ),
  # Calculate a new feature called `spending_gap` - the difference between `ytd_spend` and `average_yearly_spend`.
  FeatureFunction(
    udf_name="main.default.difference",
    output_name="spending_gap",
    # Bind the function parameter with input from other features or from request.
    # The function calculates a - b.
    input_bindings={"a": "ytd_spend", "b": "average_yearly_spend"},
  ),
]

# Create a `FeatureSpec` with the features defined above.
# The `FeatureSpec` can be accessed in Unity Catalog as a function.
fe.create_feature_spec(
  name="main.default.customer_features",
  features=features,
)
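To make the semantics of this `FeatureSpec` concrete, the following pure-Python sketch mimics what happens for one request row: the lookup fetches `average_yearly_spend` and `country` by `user_id`, and the function computes `spending_gap` as `ytd_spend - average_yearly_spend`. It is only an illustration of the logic, not how Databricks implements serving and not the actual response format. The `CUSTOMER_PROFILE` rows and the `resolve_features` helper are made up for the example.

```python
# Hypothetical stand-in for the online table main.default.customer_profile.
CUSTOMER_PROFILE = {
    1: {"average_yearly_spend": 500, "country": "JP"},
    2: {"average_yearly_spend": 300, "country": "US"},
}


def difference(a, b):
    """Mirrors the UDF described above: the function calculates a - b."""
    return a - b


def resolve_features(record: dict) -> dict:
    """Resolve one request row the way the FeatureSpec above describes."""
    # FeatureLookup: fetch the listed columns by the lookup key `user_id`.
    looked_up = CUSTOMER_PROFILE[record["user_id"]]
    # FeatureFunction: bind a <- ytd_spend (from the request) and
    # b <- average_yearly_spend (from the lookup).
    spending_gap = difference(record["ytd_spend"], looked_up["average_yearly_spend"])
    return {**looked_up, "spending_gap": spending_gap}


row = resolve_features({"user_id": 1, "ytd_spend": 598})
print(row)  # {'average_yearly_spend': 500, 'country': 'JP', 'spending_gap': 98}
```

Note that `ytd_spend` is not stored in the table: it arrives with the request, which is why the function's `input_bindings` can mix request inputs with looked-up features.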

Create an endpoint

The FeatureSpec defines the endpoint. For details, see Create and manage model serving endpoints and the API documentation.

Python

from databricks.feature_engineering.entities.feature_serving_endpoint import (
  ServedEntity,
  EndpointCoreConfig,
)

fe.create_feature_serving_endpoint(
  name="customer-features",
  config=EndpointCoreConfig(
    served_entities=ServedEntity(
      feature_spec_name="main.default.customer_features",
      workload_size="Small",
      scale_to_zero_enabled=True,
      instance_profile_arn=None,
    )
  )
)

To view the endpoint, click Serving in the left sidebar of the Databricks UI. When the state is Ready, you can query the endpoint. For more about Databricks Model Serving, see Databricks Model Serving.

Get an endpoint

To get the metadata and state of an endpoint, use the get_feature_serving_endpoint API.

Python

endpoint = fe.get_feature_serving_endpoint(name="customer-features")
# print(endpoint)
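Because an endpoint takes some time to become Ready, a script will typically poll before sending queries. The sketch below is deliberately generic: `wait_until_ready`, its `get_state` callable, and the `ready_value` default are hypothetical. How to extract a state string from the object returned by `get_feature_serving_endpoint`, and the exact string the API reports for a ready endpoint, are not described here, so both are left to the caller.

```python
import time
from typing import Callable


def wait_until_ready(
    get_state: Callable[[], str],
    ready_value: str = "Ready",  # assumption: adjust to what the API reports
    timeout_s: float = 600.0,
    poll_interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `get_state` until it returns `ready_value` or the timeout elapses.

    Returns True if the ready state was observed, False on timeout.
    """
    waited = 0.0
    while True:
        if get_state() == ready_value:
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_interval_s)
        waited += poll_interval_s
```

In a notebook, `get_state` would wrap a call to `fe.get_feature_serving_endpoint(name="customer-features")` and return whichever field of the result carries the state. Injecting `sleep` keeps the helper easy to test without real delays.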

Query an endpoint

The easiest way to try out a serving endpoint is the Serving UI:

In the left navigation bar of the Databricks workspace, click Serving.
Click the endpoint you want to query.
In the upper right of the screen, click Query endpoint.
In the Request box, enter the request body in JSON format.
Click Send request.
Example request body (JSON):
```
{
  "dataframe_records": [
    {"user_id": 1, "ytd_spend": 598},
    {"user_id": 2, "ytd_spend": 280}
  ]
}
```
The Query endpoint dialog includes sample code in curl, Python, and SQL. Click a tab to view and copy the sample code.

To copy the code, click the copy icon in the upper right of the text box.

Delete an endpoint

Warning!
This action cannot be undone.
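For querying from outside the UI, the following sketch shows roughly what a Python client could look like using only the standard library. It assumes the endpoint is reachable at the `/serving-endpoints/<name>/invocations` path used by Databricks Model Serving and that a personal access token is passed as a Bearer token. The workspace URL, token, and helper names are placeholders, and the code shown in the Query endpoint dialog remains the authoritative version for your workspace.

```python
import json
import urllib.request


def build_invocation_request(workspace_url, endpoint_name, token, records):
    """Build a POST request carrying `records` as `dataframe_records`."""
    # Assumed URL pattern; confirm it against the Query endpoint dialog.
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_endpoint(workspace_url, endpoint_name, token, records):
    """Send the request and return the decoded JSON response."""
    request = build_invocation_request(workspace_url, endpoint_name, token, records)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `query_endpoint("https://<your-workspace-url>", "customer-features", token, [{"user_id": 1, "ytd_spend": 598}])` sends the same body as the JSON example above. Splitting request construction from sending makes it possible to inspect the URL and payload without touching the network.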

Python

fe.delete_feature_serving_endpoint(name="customer-features")

Databricks Quickstart Guide

Databricks Free Trial
