Monitoring a non-watsonx.ai (3rd-party) RAG with watsonx.governance

Posted at 2025-04-07

Promote and deploy the PROMPT TEMPLATE to a SPACE

Enable Evaluate

Get the SPACE_ID and SUBSCRIPTION_ID

Sending payload data

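Payload data is input/output data: the contexts and the question, plus the answer the LLM generated from them. It does not include the correct answer. Sending and evaluating payload data requires the stage to be production. For example, if you set the development stage of a generative AI application to Production and send the application's input/output data to watsonx.governance, the data is evaluated periodically, and you can monitor the application based on those evaluations.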
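
To make the shape of a single payload record concrete, here is a minimal sketch that wraps one RAG interaction in the same request/response structure the notebook cells in this section build from `test.csv`. The helper name `build_payload_record` and the sample values are made up for illustration; only the dictionary layout comes from this article's code.

```python
from typing import Dict


def build_payload_record(
    template_variables: Dict[str, str],
    generated_text: str,
    prediction_field: str = "generated_text",
) -> dict:
    """Wrap one interaction in the request/response layout used in this article."""
    return {
        # Inputs: the values that were substituted into the prompt template
        "request": {"parameters": {"template_variables": dict(template_variables)}},
        # Output: the text the LLM generated (no correct answer for payload data)
        "response": {"results": [{prediction_field: generated_text}]},
    }


# Hypothetical sample values, only to show the resulting structure
record = build_payload_record(
    {
        "context1": "first retrieved passage",
        "context2": "second retrieved passage",
        "context3": "third retrieved passage",
        "question": "What is monitored?",
    },
    "The RAG application is monitored.",
)
print(record)
```

A list of such records is what gets passed as `request_body` when storing payload records.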

# %%
PROJECT_ID = "<INPUT YOUR PROJECT_ID>"
PROMPT_TEMPLATE_ASSET_ID = "<INPUT YOUR PROMPT_TEMPLATE_ASSET_ID>"

# %%
CPD_URL = "<INPUT YOUR CPD_URL>"
CPD_USERNAME = "<INPUT YOUR CPD_USERNAME>"
CPD_PASSWORD = "<INPUT YOUR CPD_PASSWORD>"
CPD_API_KEY = "<INPUT YOUR CPD_API_KEY>"

# %%
SPACE_ID = "<INPUT YOUR SPACE_ID>"
SUBSCRIPTION_ID = "<INPUT YOUR SUBSCRIPTION_ID>"

# %%
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *

authenticator = CloudPakForDataAuthenticator(
    url=CPD_URL,
    username=CPD_USERNAME,
    password=CPD_PASSWORD,
    disable_ssl_verification=True
)

wos_client = APIClient(
    service_url=CPD_URL,
    authenticator=authenticator,
)
data_mart_id = wos_client.service_instance_id
print(data_mart_id)
print(wos_client.version)

# %%
# Show the monitors attached to this subscription
wos_client.monitor_instances.show(target_target_id=SUBSCRIPTION_ID)

# %%
# Look up the "mrm" monitor instance of this subscription
monitor_definition_id = "mrm"
response = wos_client.monitor_instances.list(
    data_mart_id=data_mart_id,
    monitor_definition_id=monitor_definition_id,
    target_target_id=SUBSCRIPTION_ID,
    space_id=SPACE_ID)
response.result.to_dict()

# %%
# Take the ID of the first monitor instance returned
monitor_instance_id = response.result.to_dict()["monitor_instances"][0]["metadata"]["id"]
print(monitor_instance_id)

# %%
import csv

# Columns of test.csv that fill the prompt template variables
context_fields = ["context1", "context2", "context3"]
question_field = "question"
feature_fields = context_fields + [question_field]
# Column that holds the answer generated by the LLM
prediction = "generated_text"

pl_data = []
prediction_list = []

test_data_path = "test.csv"
# newline="" is what the csv module expects; UTF-8 is assumed for test.csv
with open(test_data_path, "r", newline="", encoding="utf-8") as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        # Request side: the values passed to the prompt template
        request = {
            "parameters": {
                "template_variables": {
                }
            }
        }
        for each in feature_fields:
            request["parameters"]["template_variables"][each] = str(row[each])

        # Response side: the text the LLM generated
        predicted_val = row[prediction]
        prediction_list.append(predicted_val)
        response = {
            "results": [
                {
                    prediction: predicted_val
                }
            ]
        }
        record = {"request": request, "response": response}
        pl_data.append(record)
pl_data

# %%
import json

print(json.dumps(pl_data, indent=2, ensure_ascii=False))

# %%
from ibm_watson_openscale.supporting_classes.enums import *

data_set_id = None
response = wos_client.data_sets.list(
    type=DataSetTypes.PAYLOAD_LOGGING, 
    target_target_id=SUBSCRIPTION_ID, 
    target_target_type=TargetTypes.SUBSCRIPTION)
response.result.to_dict()

# %%
data_set_id = response.result.to_dict()['data_sets'][0]['metadata']['id']
print(data_set_id)

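# %%
# background_mode=True returns without waiting for the records to be stored,
# so the count in the next cell may lag behind for a short while
response = wos_client.data_sets.store_records(
    data_set_id=data_set_id,
    request_body=pl_data,
    background_mode=True)
response.result.to_dict()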

# %%
wos_client.data_sets.get_records_count(data_set_id=data_set_id)
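
Because the records are stored with `background_mode=True`, the count may not reflect the new records right away. A minimal sketch of a polling helper follows; `wait_for_record_count` and its parameters are hypothetical names for illustration, and the counter is passed in as a plain callable so the snippet runs without a watsonx.governance connection.

```python
import time
from typing import Callable


def wait_for_record_count(
    get_count: Callable[[], int],
    expected: int,
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
) -> int:
    """Poll get_count() until it reaches `expected` or the timeout elapses.

    Returns the last count that was observed.
    """
    deadline = time.monotonic() + timeout_s
    count = get_count()
    while count < expected and time.monotonic() < deadline:
        time.sleep(interval_s)
        count = get_count()
    return count


# Stand-in counter for demonstration; it simply replays three fake counts
_fake_counts = iter([0, 4, 10])
final_count = wait_for_record_count(lambda: next(_fake_counts), expected=10, interval_s=0.0)
print(final_count)
```

In the notebook, the callable would presumably be `lambda: wos_client.data_sets.get_records_count(data_set_id=data_set_id)`. Note that `expected` has to include any records already in the data set, not just the ones sent in this run.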

Manual evaluation with payload data

The input/output data sent as payload is evaluated every hour, but you can also trigger an evaluation manually from the GUI.
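
The same on-demand evaluation can probably be triggered from the notebook as well. The sketch below assumes that the `monitor_instance_id` looked up earlier (the `mrm` monitor) is the one to run, and that `monitor_instances.run` of the `ibm_watson_openscale` client accepts these arguments in your SDK version. The wrapper `run_evaluation` is a hypothetical helper, not part of the SDK, and this was not verified against the environment used in this article.

```python
def run_evaluation(client, monitor_instance_id: str):
    """Trigger one evaluation run and wait for it to finish.

    `client` is expected to be the APIClient created earlier (wos_client).
    background_mode=False is assumed to block until the run completes.
    """
    return client.monitor_instances.run(
        monitor_instance_id=monitor_instance_id,
        background_mode=False,
    )


# Usage in the notebook (requires a live connection, so it is left commented out):
# run_response = run_evaluation(wos_client, monitor_instance_id)
```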

Sending feedback data

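Feedback data is the same kind of input/output data (the contexts and the question, plus the answer the LLM generated from them) with the correct answer included. Sending and evaluating feedback data requires the stage to be production. For example, if you put a generative AI application into Production and send its input/output data together with the correct answers to watsonx.governance, the data is evaluated periodically, and you can monitor the application based on those evaluations.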
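
Unlike payload records, feedback records in this article are sent as one `fields` list plus rows of `values`, with the correct answer and the original prediction appended after the features. Here is a minimal sketch of that layout using plain dictionaries instead of a DataFrame; `build_feedback_data` and the sample row are made up for illustration.

```python
from typing import Dict, List, Sequence


def build_feedback_data(
    rows: Sequence[Dict[str, str]],
    feature_fields: List[str],
    label_column: str = "answer",
    prediction_column: str = "generated_text",
) -> List[dict]:
    """Build the fields/values structure used for feedback records in this article."""
    # Column order: features, then the correct answer, then the LLM output
    fields = list(feature_fields) + [label_column, "_original_prediction"]
    values = [
        [row[name] for name in feature_fields] + [row[label_column], row[prediction_column]]
        for row in rows
    ]
    return [{"fields": fields, "values": values}]


# Hypothetical sample row, only to show the resulting structure
sample_rows = [
    {
        "context1": "passage A",
        "context2": "passage B",
        "context3": "passage C",
        "question": "What is monitored?",
        "answer": "The RAG application.",
        "generated_text": "The RAG application is monitored.",
    }
]
feedback = build_feedback_data(
    sample_rows, ["context1", "context2", "context3", "question"]
)
print(feedback)
```

Each row in `values` must follow the order of `fields`; a missing column raises a `KeyError` here rather than producing a misaligned row.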

# %%
PROJECT_ID = "<INPUT YOUR PROJECT_ID>"
PROMPT_TEMPLATE_ASSET_ID = "<INPUT YOUR PROMPT_TEMPLATE_ASSET_ID>"

# %%
CPD_URL = "<INPUT YOUR CPD_URL>"
CPD_USERNAME = "<INPUT YOUR CPD_USERNAME>"
CPD_PASSWORD = "<INPUT YOUR CPD_PASSWORD>"
CPD_API_KEY = "<INPUT YOUR CPD_API_KEY>"

# %%
SPACE_ID = "<INPUT YOUR SPACE_ID>"
SUBSCRIPTION_ID = "<INPUT YOUR SUBSCRIPTION_ID>"

# %%
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *

authenticator = CloudPakForDataAuthenticator(
    url=CPD_URL,
    username=CPD_USERNAME,
    password=CPD_PASSWORD,
    disable_ssl_verification=True
)

wos_client = APIClient(
    service_url=CPD_URL,
    authenticator=authenticator,
)
data_mart_id = wos_client.service_instance_id
print(data_mart_id)
print(wos_client.version)

# %%
# Show the monitors attached to this subscription
wos_client.monitor_instances.show(target_target_id=SUBSCRIPTION_ID)

# %%
# Look up the "mrm" monitor instance of this subscription
monitor_definition_id = "mrm"
response = wos_client.monitor_instances.list(
    data_mart_id=data_mart_id,
    monitor_definition_id=monitor_definition_id,
    target_target_id=SUBSCRIPTION_ID,
    space_id=SPACE_ID)
response.result.to_dict()

# %%
# Take the ID of the first monitor instance returned
monitor_instance_id = response.result.to_dict()["monitor_instances"][0]["metadata"]["id"]
print(monitor_instance_id)

# %%
from ibm_watson_openscale.supporting_classes.enums import *

data_set_id = None
response = wos_client.data_sets.list(
    type=DataSetTypes.FEEDBACK, 
    target_target_id=SUBSCRIPTION_ID, 
    target_target_type=TargetTypes.SUBSCRIPTION)
response.result.to_dict()

# %%
data_set_id = response.result.to_dict()['data_sets'][0]['metadata']['id']
print(data_set_id)

# %%
import pandas as pd

test_data_path = "test.csv"
llm_data = pd.read_csv(test_data_path)
llm_data.head()

# %%
llm_data.shape

# %%
test_data_content = []
context_fields = ["context1", "context2", "context3"]
question_field = "question"
feature_fields = context_fields + [question_field]   #For Alternative Dataset
prediction_list = llm_data["generated_text"].tolist()
label_column = "answer"

for _, row in llm_data.iterrows():
    # Build one row in the same order as `fields` below: features, label, prediction.
    # Every feature column is read unconditionally, so a missing column raises
    # a KeyError instead of silently shifting the values.
    result_row = [row[key] for key in feature_fields]
    result_row.append(row[label_column])
    result_row.append(row["generated_text"])
    test_data_content.append(result_row)

if len(test_data_content) == 10: # 10 records are there in the downloaded CSV
    print("generated feedback data from DataFrame")
else:
    print("Failed to generate feedback data from DataFrame, kindly verify the DataFrame content")

fields = feature_fields.copy()
fields.append(label_column)
fields.append("_original_prediction")
feedback_data = [
    {
        "fields": fields,
        "values": test_data_content
    }
]

feedback_data

# %%
import json

print(json.dumps(feedback_data, indent=2, ensure_ascii=False))

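# %%
# background_mode=True returns without waiting for the records to be stored,
# so the count in the next cell may lag behind for a short while
response = wos_client.data_sets.store_records(
    data_set_id=data_set_id,
    request_body=feedback_data,
    background_mode=True)
response.result.to_dict()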

# %%
wos_client.data_sets.get_records_count(data_set_id=data_set_id)

Manual evaluation with feedback data

As with payload data, the input/output data sent as feedback is evaluated every hour, but you can also trigger an evaluation manually from the GUI.

Discussion, or rather, a summary of watsonx.governance

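In the development phase, you first create a PROMPT TEMPLATE in a PROJECT and evaluate it with watsonx.governance.
Once the evaluation results look reasonably good, you deploy it to a SPACE as PRODUCTION.
A web application running in production can then send its input/output data to watsonx.governance, which enables periodic evaluation.
By monitoring these evaluations, users can continuously monitor and improve their generative AI use case.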

Next, let's try building a demo of a RAG that is monitored with watsonx.governance...
