Evaluating a RAG Outside watsonx.ai (Third Party) with watsonx.governance

Posted at 2025-04-07

Create a project and get the PROJECT_ID

Create the prompt template

# %%
!pip install ibm_aigov_facts_client setuptools tabulate

# %%
PROJECT_ID = "<INPUT YOUR PROJECT_ID>"

# %%
CPD_URL = "<INPUT YOUR CPD_URL>"
CPD_USERNAME = "<INPUT YOUR CPD_USERNAME>"
CPD_PASSWORD = "<INPUT YOUR CPD_PASSWORD>"
CPD_API_KEY = "<INPUT YOUR CPD_API_KEY>"
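
Hard-coded credentials are easy to leak when a notebook is shared. Below is a minimal sketch of reading the same settings from environment variables instead. `load_cpd_settings` is a hypothetical helper and not part of any IBM SDK, and it assumes the environment variables use the same names as the constants above.

```python
import os


def load_cpd_settings(env=None, names=("CPD_URL", "CPD_USERNAME", "CPD_PASSWORD", "CPD_API_KEY")):
    """Collect connection settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Report every missing or empty variable at once instead of failing on the first.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Usage in the notebook (assumes the variables were exported before starting Jupyter):
# settings = load_cpd_settings()
# CPD_URL = settings["CPD_URL"]
```

Passing a plain dict as `env` makes the helper easy to try out without touching the real environment.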

# %%
from ibm_aigov_facts_client import AIGovFactsClient
from ibm_aigov_facts_client import CloudPakforDataConfig

creds=CloudPakforDataConfig(
    service_url=CPD_URL,
    username=CPD_USERNAME,
    api_key=CPD_API_KEY
)
facts_client = AIGovFactsClient(
    cloud_pak_for_data_configs=creds,
    container_id=PROJECT_ID,
    container_type="project",
    disable_tracing=True
)

# %%
from ibm_aigov_facts_client import DetachedPromptTemplate, PromptTemplate

detached_information = DetachedPromptTemplate(
    prompt_id="detached_prompt",
    model_id="meta-llama/llama-3-70b-instruct",
    model_provider="Facebook",
    model_name="llama-3-70b-instruct",
    model_url="https://us-south.ml.cloud.ibm.com/ml/v1/deployments/insurance_test_deployment/text/generation?version=2021-05-01",
    prompt_url="prompt_url",
    prompt_additional_info={"IBM Cloud Region": "us-east1"}
)

# %%
prompt_input="""
[INST] <>You are an assistant named Buddy who helps customers of an Australian online-only insurer named Bingle. Being online only, you should not suggest contacting Bingle. You should only answer queries related to car insurance. Your answer must be general in nature. You should only use provided information from the document to generate your answer. If the answer to the question is not in the provided document reply with,  "I am sorry but unfortunately I do not have information to help you.".  Use Australian spelling and Australian insurance terminology. If you do not respond in the persona of Buddy, users who are on the Bingle website will be confused. You should maintain a friendly customer service tone.<> 
    Here is the document you should use to answer the user:
    {context1}\n{context2}\n{context3}
    Here are some important rules for the interaction:
    - Always stay in character, as Buddy from Bingle and answer user questions in first person.
    - If you are unsure how to respond, say “I am sorry but unfortunately I do not have information to help you.”.
    - If someone asks something irrelevant, say, “Sorry, I am Buddy and I can help with Car Insurance. Do you have an insurance related question today I can help you with?”.
    - Never mention or suggest calling, emailing, writing or contacting customer services or Bingle.
    - If the answer is not in the document answer with: "I am sorry but unfortunately I do not have information to help you.".
    - If the document provides instructions, ensure to list them out so that the user understands the process to take, this will be extremely helpful.
    - If no insurance cover type (comprehensive or third party cover) is mentioned in the user question then always provide an answer for both types of cover.
    - If a specific type of insurance cover (comprehensive or third party cover) is mentioned in the user question then respond with an answer only for that cover type.
    - Remember, users have already looked at the Bingle website so do not suggest that they check our website as this will be condescending and rude, instead suggest they review the Product Disclosure Statement (PDS).
    
    Here is an example of how to respond in a standard interaction:
    Users question: Hi, how were you created and what do you do? [/INST]
    Step 1: Check if the question is related to Bingle car insurance. Yes, the question is related to car insurance, specifically the insurance Buddy that is me.
    Step 2: Check if the answer can be found in the provided document. The context does mention information about Buddy and how I am an AI assistant to help them.
    Step 3: Provide the answer in structured json. 
    ANSWER: {{"answer": "Hello! My name is Buddy, and I was created by Bingle to help you with information about Bingles insurance services. What can I help you with today?."}} 
    [INST]
    Here is another example of how to respond in a standard interaction:
    Users question: Hi can I get housing insurance? [/INST]
    Step 1: Check if the question is related to Bingle car insurance. No, the question is not related to car insurance.
    Step 2: Check if the answer can be found in the provided document. The document does not mention information about housing insurance.
    Step 3: Provide the answer in structured json. 
    ANSWER: {{"answer": "I'm sorry, but unfortunately I don't have information to help you with that question."}}
    [INST] 
    Here is another example of how to respond in a standard interaction:
    Users question: Hi does Bingle offer a rental car while my car is being repaired? [/INST]
    Step 1: Check if the question is related to Bingle car insurance. Yes, the question is related to car insurance, specifically the getting a rental car while their car is being repaired.
    Step 2: Check if the answer can be found in the provided document. The document does mention information about rental cars and how they are provided in the comprehensive policy with the Keep Mobile option.
    Step 3: Provide the answer in structured json. 
    ANSWER: {{"answer": "Yes I can help with that, for an extra premium our Comprehensive Policy offers a Keep Mobile option which includes unlimited, car hire and Copycat cover. Our Third Party Policy does not include the Keep Mobile option. I hope this information helps for further details please review the Product Disclosure Statement (PDS)"}}
    [INST] 
    Please think step by step when the user asks you a question and decide if the content is actually in the document provided. Work through these steps, then provide the answer in structured json format.
    
    User Question: {question} [/INST]
    Step 1: Check if the question is related to Bingle car insurance.

"""
# %%
prompt_variables = {"context1": "", "context2": "", "context3": "", "question": ""}
prompt_text = prompt_input  # avoid shadowing the built-in input()
input_prefix = ""
output_prefix = ""

# %%
prompt_template = PromptTemplate(
    input=prompt_text,
    prompt_variables=prompt_variables,
    input_prefix=input_prefix,
    output_prefix=output_prefix,
)
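
The prompt mixes single-brace placeholders such as `{question}` with doubled braces around the JSON answer. The sketch below shows how such a template renders, assuming the variables are filled in with Python `str.format` semantics (an assumption; the service performs its own substitution). The short `template` and the `row` values are illustrative stand-ins, not the real prompt or data.

```python
from string import Formatter

# Shortened stand-in for prompt_input, using the same placeholder style.
template = (
    "Document:\n{context1}\n{context2}\n{context3}\n"
    'ANSWER: {{"answer": "..."}}\n'
    "User Question: {question}\n"
)

# Illustrative values for one evaluation record.
row = {
    "context1": "Comprehensive cover includes the Keep Mobile option.",
    "context2": "",
    "context3": "",
    "question": "Do I get a rental car?",
}

# Placeholder names found in the template; doubled braces are literals, not fields.
fields = {name for _, name, _, _ in Formatter().parse(template) if name}

# Every placeholder should have a matching prompt variable, and vice versa.
unmatched = fields.symmetric_difference(row)

rendered = template.format(**row)
print(rendered)
```

Running the same `fields` check against the full prompt and the keys of `prompt_variables` is a cheap way to catch a misspelled placeholder before the asset is created.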

# %%
model_id = "meta-llama/llama-3-70b-instruct"
task_id = "retrieval_augmented_generation"
name = "sample detached prompt template"
description = "sample detached prompt template for RAG based on meta-llama/llama-3-70b-instruct"

# %%
response = facts_client.assets.create_detached_prompt(
    model_id=model_id,
    task_id=task_id,
    name=name,
    description=description,
    prompt_details=prompt_template,
    detached_information=detached_information)

# %%
response

Check the prompt template and get the PROMPT_TEMPLATE_ASSET_ID

Run the evaluation

# %%
!pip install ibm_watson_openscale
!pip install "pandas<=2.1.9"

# %%
PROJECT_ID = "<INPUT YOUR PROJECT_ID>"
PROMPT_TEMPLATE_ASSET_ID = "<INPUT YOUR PROMPT_TEMPLATE_ASSET_ID>" # can be checked in the GUI

# %%
CPD_URL = "<INPUT YOUR CPD_URL>"
CPD_USERNAME = "<INPUT YOUR CPD_USERNAME>"
CPD_PASSWORD = "<INPUT YOUR CPD_PASSWORD>"
CPD_API_KEY = "<INPUT YOUR CPD_API_KEY>"

# %%
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient

authenticator = CloudPakForDataAuthenticator(
    url=CPD_URL,
    username=CPD_USERNAME,
    password=CPD_PASSWORD,
    # Skips TLS certificate checks; use only on test clusters with self-signed certificates.
    disable_ssl_verification=True
)

wos_client = APIClient(
    service_url=CPD_URL,
    authenticator=authenticator,
)
data_mart_id = wos_client.service_instance_id
print(data_mart_id)
print(wos_client.version)

# %%
from ibm_watson_openscale.base_classes import ApiRequestFailure

try:
    wos_client.wos.add_instance_mapping(
        service_instance_id=data_mart_id,
        project_id=PROJECT_ID
    )
except ApiRequestFailure as arf:
    if arf.response.status_code == 409:
        # Instance mapping already exists
        pass
    else:
        raise

# %% <- can also be done from the GUI
label_column = "answer"
context_fields = ["context1", "context2", "context3"]
question_field = "question"
operational_space_id = "development"
problem_type = "retrieval_augmented_generation"
input_data_type = "unstructured_text"
monitors = {
    "generative_ai_quality": {
        "parameters": {
            "min_sample_size": 10,
            "metrics_configuration": {}
        }
    }
}

response = wos_client.monitor_instances.mrm.execute_prompt_setup(
    prompt_template_asset_id=PROMPT_TEMPLATE_ASSET_ID, 
    project_id=PROJECT_ID,
    label_column=label_column,
    context_fields=context_fields,     
    question_field=question_field,     
    operational_space_id=operational_space_id, 
    problem_type=problem_type,
    input_data_type=input_data_type, 
    supporting_monitors=monitors, 
    background_mode=True)

# %%
result = response.result
result.to_dict()

# %%
response = wos_client.monitor_instances.mrm.get_prompt_setup(
    prompt_template_asset_id=PROMPT_TEMPLATE_ASSET_ID,
    project_id=PROJECT_ID
)

# %%
result = response.result
result.to_dict()
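
Because `execute_prompt_setup` is called with `background_mode=True`, the setup may still be running when the result is first read. Below is a minimal polling sketch. `wait_for_state` is a hypothetical helper, and the `status.state` field and the state names (`FINISHED`, `FAILED`, `ERROR`) are assumptions about the shape of the returned dict, so check them against the output of `result.to_dict()` above.

```python
import time


def wait_for_state(fetch, done_states=("FINISHED",), failed_states=("FAILED", "ERROR"),
                   interval=5.0, timeout=600.0):
    """Call fetch() until result["status"]["state"] reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        # Normalise the case so "finished" and "FINISHED" are treated alike.
        state = str(result.get("status", {}).get("state", "")).upper()
        if state in done_states:
            return result
        if state in failed_states:
            raise RuntimeError(f"setup ended in state {state}: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still in state {state!r} after {timeout} seconds")
        time.sleep(interval)


# Usage with the client from this notebook:
# setup = wait_for_state(
#     lambda: wos_client.monitor_instances.mrm.get_prompt_setup(
#         prompt_template_asset_id=PROMPT_TEMPLATE_ASSET_ID,
#         project_id=PROJECT_ID,
#     ).result.to_dict()
# )
```

Taking a callable rather than the client itself keeps the helper independent of the SDK, so the same loop can be reused for other background operations that expose a state field.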

# %% <- can be obtained from the GUI
# Replace these with the IDs from your own environment.
DEPLOYMENT_ID = "591bed11-b8ab-4c82-9b04-89c649c75d21"
SUBSCRIPTION_ID = "09b35184-4cab-40c7-ac82-9cd6fe13a33f"

# %%
wos_client.monitor_instances.show(target_target_id=SUBSCRIPTION_ID)

# %%
data_mart_id = wos_client.service_instance_id
monitor_definition_id = "mrm"
project_id = PROJECT_ID
response = wos_client.monitor_instances.list(
    data_mart_id=data_mart_id,
    monitor_definition_id=monitor_definition_id,
    target_target_id=SUBSCRIPTION_ID,
    project_id=project_id
)

# %%
monitor_instance_id = response.result.to_dict()['monitor_instances'][0]['metadata']['id']
print(monitor_instance_id)

# %%
test_data_set_name = "data"
content_type = "multipart/form-data"
body = {}

# %%
!wget -O RAG_data.csv https://ibm.box.com/shared/static/3ysiqmcqzemlbp68pc7dg7homj5jjztt.csv

# %%
import pandas as pd

filepath = "RAG_data.csv"
df = pd.read_csv(filepath_or_buffer=filepath)
df.head()

# %%
df.shape

# %%
import csv

test_data_path = "test.csv"
# Write the first 10 rows, matching min_sample_size=10 in the monitor configuration.
df.iloc[:10].to_csv(test_data_path, index=False, quoting=csv.QUOTE_ALL)
# Alternative slices for further evaluation runs (uncomment one at a time):
#df.iloc[10:20].to_csv(test_data_path, index=False, quoting=csv.QUOTE_ALL)
#df.iloc[10:30].to_csv(test_data_path, index=False, quoting=csv.QUOTE_ALL)
#df.to_csv(test_data_path, index=False, quoting=csv.QUOTE_ALL)
#df.iloc[30:].to_csv(test_data_path, index=False, quoting=csv.QUOTE_ALL)
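
Before uploading, it can help to confirm that the test file has the columns named in the prompt setup and enough rows. Below is a minimal sketch using only the standard library. `check_test_csv` is a hypothetical helper, and treating `min_sample_size` as a minimum row count is an assumption about how the monitor uses that parameter.

```python
import csv


def check_test_csv(path, required_columns, min_rows):
    """Return the number of data rows, or raise ValueError if the file is unusable."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [column for column in required_columns if column not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        # Count data rows only; the header line is consumed by DictReader.
        n_rows = sum(1 for _ in reader)
    if n_rows < min_rows:
        raise ValueError(f"{n_rows} rows found, at least {min_rows} needed")
    return n_rows


# Usage with the names from the prompt setup cell:
# check_test_csv("test.csv", ["context1", "context2", "context3", "question", "answer"], 10)
```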

# %%
response = wos_client.monitor_instances.mrm.evaluate_risk(
    monitor_instance_id=monitor_instance_id,
    test_data_set_name=test_data_set_name, 
    test_data_path=test_data_path,
    content_type=content_type,
    body=body,
    project_id=PROJECT_ID,
    includes_model_output=True,
    background_mode=True)

# %%
response = wos_client.monitor_instances.mrm.get_risk_evaluation(monitor_instance_id, project_id=PROJECT_ID)
response.result.to_dict()

Get the DEPLOYMENT_ID and SUBSCRIPTION_ID

These IDs become available after you enable Evaluate, either in the GUI or with the Python SDK (wos_client.monitor_instances.mrm.execute_prompt_setup()).

Check the results
