Getting Started with the Azure AI Foundry SDK! - Part 3: Evaluation, Tracing, and Production Operations

Last updated at 2025-10-31, posted at 2025-10-31

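Introduction

After working through Part 1 and Part 2, you can now build a basic AI app.

Part 3 covers the topics that matter on the way to production: how do you measure the quality of an AI app, and what do you do when something goes wrong in production?

What you will learn in this article:

How to evaluate the quality of an AI app
How to visualize behavior with tracing
How to combine multiple AI services
How to prepare for a production deployment
Security and best practices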

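Evaluating AI apps - how do you measure quality?

For an ordinary program, "the expected output came back" is enough to pass. An AI model can return a different answer on every call, so exact-match checks do not work and evaluation is harder.

Why evaluation is needed

Consider this example:

Question: "What is the population of Tokyo?"

AI response 1: "About 14 million."  ← Correct
AI response 2: "Many people live in Tokyo, the capital."  ← Does not answer the question
AI response 3: "Osaka is..."  ← Completely wrong

How can cases like these be checked automatically?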

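Evaluation features in Azure AI Foundry

Azure AI Foundry has built-in features for evaluating AI apps.

What can be evaluated:

Relevance: is the answer appropriate for the question?
Coherence: is it logical and easy to understand?
Fluency: does the text read naturally?
Groundedness: is it based on the information provided? (for RAG)
Safety: does it avoid harmful content?

Basic evaluation code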

# evaluation_basic.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

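# Prepare the test data
test_cases = [
    {
        "question": "What is the population of Tokyo?",
        "expected_type": "a specific number"
    },
    {
        "question": "What is Python?",
        "expected_type": "an explanation of a programming language"
    },
    {
        "question": "What causes global warming?",
        "expected_type": "a scientific explanation"
    }
]

# Send each question to the AI and collect the response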
results = []
for test in test_cases:
    response = project.inference.get_chat_completions(
        model=os.getenv("MODEL_DEPLOYMENT_NAME"),
        messages=[{"role": "user", "content": test["question"]}]
    )
    
    answer = response.choices[0].message.content
    
    results.append({
        "question": test["question"],
        "answer": answer,
        "expected_type": test["expected_type"]
    })
    
    print(f"\nQuestion: {test['question']}")
    print(f"Answer: {answer}")
    print(f"Expected: {test['expected_type']}")
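
The loop above only prints each answer next to its expected type, so nothing is checked automatically yet. As a minimal sketch of a first automated check, assuming a crude heuristic is acceptable for fact-style questions, the hypothetical helper `contains_number` below flags answers that contain no numeric value at all. The function name and the sample answers (modeled on the Tokyo example earlier) are illustrative and not part of the SDK.

```python
import re

def contains_number(answer: str) -> bool:
    """Return True if the answer contains at least one digit."""
    # In Python 3, \d also matches full-width digits such as "１４００".
    return re.search(r"\d", answer) is not None

sample_answers = [
    "About 14 million people.",
    "Many people live in Tokyo, the capital.",
    "Osaka is...",
]

for text in sample_answers:
    status = "has a number" if contains_number(text) else "no number found"
    print(f"{status}: {text}")
```

A check like this cannot tell whether the number is correct; it only catches answers that dodge the question, which is why the model-based evaluation in the next section is still needed.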

Implementing evaluation metrics

Let's implement a somewhat more advanced evaluation.

# advanced_evaluation.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

def evaluate_response(question, answer, criteria):
    """
    Have another AI call evaluate the AI's answer
    (the AI-as-a-Judge pattern).
    """
    evaluation_prompt = f"""
Evaluate the following question and answer.

Question: {question}
Answer: {answer}

Evaluation criterion: {criteria}

Give a score from 1 to 5 (5 is best).
Also give a brief reason.

Format:
Score: [1-5]
Reason: [reason for the score]
"""

    response = project.inference.get_chat_completions(
        model=os.getenv("MODEL_DEPLOYMENT_NAME"),
        messages=[
            {"role": "system", "content": "You are a strict evaluator."},
            {"role": "user", "content": evaluation_prompt}
        ],
        temperature=0.3  # Kept low because consistency matters for evaluation
    )

    return response.choices[0].message.content

# Test case
qa_pair = {
    "question": "What is machine learning?",
    "answer": "Machine learning is a technique in which computers automatically learn from data and discover patterns."
}

# Evaluate from several perspectives
criteria_list = [
    "Accuracy: is the explanation technically accurate?",
    "Clarity: can a beginner understand it?",
    "Completeness: does it cover the important elements?"
]

print(f"Question: {qa_pair['question']}")
print(f"Answer: {qa_pair['answer']}\n")

for criteria in criteria_list:
    print(f"\n=== {criteria} ===")
    evaluation = evaluate_response(
        qa_pair['question'],
        qa_pair['answer'],
        criteria
    )
    print(evaluation)

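What is the AI-as-a-Judge pattern?

A technique in which one AI's answers are evaluated by another AI
Checking everything by hand is expensive, so the AI does a first-pass evaluation
In the end, a human still checks a sample to confirm how accurate the judging AI is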
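
The judge replies in free text, so before scores can be aggregated or compared against a human-checked sample they have to be extracted. The following is a minimal sketch, assuming the judge follows the "Score / Reason" format requested in the prompt. The helper `parse_judgement` and both regular expressions are hypothetical; the patterns accept the English labels as well as the Japanese labels スコア and 理由.

```python
import re
from typing import Optional, Tuple

# Accept English or Japanese labels, and both ":" and the full-width "：".
SCORE_PATTERN = re.compile(r"(?:Score|スコア)\s*[:：]\s*\[?([1-5])(?!\d)\]?")
REASON_PATTERN = re.compile(r"(?:Reason|理由)\s*[:：]\s*(.+)")

def parse_judgement(text: str) -> Tuple[Optional[int], str]:
    """Extract (score, reason) from a judge reply; score is None if not found."""
    score_match = SCORE_PATTERN.search(text)
    reason_match = REASON_PATTERN.search(text)
    score = int(score_match.group(1)) if score_match else None
    reason = reason_match.group(1).strip() if reason_match else ""
    return score, reason

reply = "Score: 4\nReason: Accurate, but it leaves out how models are trained."
print(parse_judgement(reply))
```

Returning `None` for a missing score, rather than guessing, makes malformed judge replies easy to count and to route to a human reviewer.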

Batch evaluation

Here is practical code that evaluates multiple test cases in one run.

# batch_evaluation.py
import os
import json
from datetime import datetime
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)
model_name = os.getenv("MODEL_DEPLOYMENT_NAME")

# Test suite (a collection of test cases)
test_suite = [
    {"id": 1, "question": "What is the population of Tokyo?", "category": "fact check"},
    {"id": 2, "question": "What are the features of Python?", "category": "technical explanation"},
    {"id": 3, "question": "What are solutions to environmental problems?", "category": "opinion/proposal"},
    {"id": 4, "question": "What is 1+1?", "category": "calculation"},
    {"id": 5, "question": "What book do you recommend?", "category": "recommendation"}
]

def run_test_suite(test_suite, model_name):
    """Run the test suite."""
    results = []

    for test in test_suite:
        print(f"Running test {test['id']}/{len(test_suite)}...")
        
        try:
            # Ask the AI
            response = project.inference.get_chat_completions(
                model=model_name,
                messages=[{"role": "user", "content": test["question"]}],
                temperature=0.7
            )
            
            answer = response.choices[0].message.content
            
            # Simple check (answer length)
            length_ok = len(answer) > 20  # longer than 20 characters
            
            result = {
                "test_id": test["id"],
                "category": test["category"],
                "question": test["question"],
                "answer": answer,
                "answer_length": len(answer),
                "length_ok": length_ok,
                "timestamp": datetime.now().isoformat()
            }
            
            results.append(result)
            
        except Exception as e:
            results.append({
                "test_id": test["id"],
                "category": test["category"],
                "question": test["question"],
                "error": str(e),
                "timestamp": datetime.now().isoformat()
            })
    
    return results

# Run the tests
print("=== Starting test suite ===\n")
results = run_test_suite(test_suite, model_name)

# Save the results to a JSON file
output_file = f"evaluation_results_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
with open(output_file, "w", encoding="utf-8") as f:
    json.dump(results, f, ensure_ascii=False, indent=2)

print(f"\n✅ Tests finished! Results saved to {output_file}")

# Show a summary
success_count = sum(1 for r in results if "error" not in r)
print("\n=== Result summary ===")
print(f"Total tests: {len(results)}")
print(f"Succeeded: {success_count}")
print(f"Failed: {len(results) - success_count}")

# Count per category
categories = {}
for result in results:
    cat = result["category"]
    categories[cat] = categories.get(cat, 0) + 1

print("\nBy category:")
for cat, count in categories.items():
    print(f"  {cat}: {count}")
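
The per-category summary above counts how many tests ran, but not how many of them passed. As a small sketch of one way to extend it, the hypothetical helper `summarize_by_category` below reports passed/total per category, treating an entry as passed only when it has no `error` key and its `length_ok` flag is true. The sample records are made up and use the same keys as the results produced by `run_test_suite`.

```python
from collections import defaultdict

def summarize_by_category(results):
    """Return {category: (passed, total)} for a list of result dicts."""
    summary = defaultdict(lambda: [0, 0])  # category -> [passed, total]
    for r in results:
        counts = summary[r["category"]]
        counts[1] += 1
        if "error" not in r and r.get("length_ok", False):
            counts[0] += 1
    return {cat: tuple(c) for cat, c in summary.items()}

sample_results = [
    {"category": "facts", "length_ok": True},
    {"category": "facts", "length_ok": False},
    {"category": "math", "error": "timeout"},
]

for cat, (passed, total) in summarize_by_category(sample_results).items():
    print(f"{cat}: {passed}/{total} passed")
```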

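Tracing - visualizing how your AI app behaves

As an AI app grows more complex, questions like "why did it return this answer?" and "where is the time going?" become harder to answer.

What is tracing?

In short: it is a feature that records the execution steps of a program so they can be visualized.

For example:

User input: "What is the weather in Tokyo?"
  ↓
1. Process the input (0.1 s)
  ↓
2. Call the AI model (2.3 s)
  ↓
3. Call the weather API (0.5 s)
  ↓
4. Format the result (0.1 s)
  ↓
Response: "It is sunny in Tokyo."

Total: 3.0 s

Once this is visible, it is clear that most of the time is spent in the AI model call.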
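
The timeline above can be turned into a quick calculation. This sketch uses only the four durations from the example (0.1, 2.3, 0.5 and 0.1 seconds) to compute the total and each step's share of it; the step labels and variable names are illustrative.

```python
# Step durations in seconds, taken from the example timeline above.
steps = {
    "process input": 0.1,
    "AI model call": 2.3,
    "weather API call": 0.5,
    "format result": 0.1,
}

total = sum(steps.values())
slowest = max(steps, key=steps.get)

print(f"total: {total:.1f}s")
for name, seconds in steps.items():
    print(f"{name}: {seconds:.1f}s ({seconds / total:.0%})")
print(f"slowest step: {slowest}")
```

The model call accounts for roughly three quarters of the total, so optimizing input processing or formatting would barely change the response time.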

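Tracing in Azure AI Foundry

Azure AI Foundry can record tracing information for your calls, assuming tracing has been set up for the project (an Application Insights resource is connected).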

# tracing_basic.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()

# Create the project client
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

# Tracing
# Assumes tracing has already been set up for the project (this script does not configure it)
print("Tracing is enabled")

# Call the AI as usual
response = project.inference.get_chat_completions(
    model=os.getenv("MODEL_DEPLOYMENT_NAME"),
    messages=[
        {"role": "user", "content": "Hello! Please tell me about Python."}
    ]
)

print(response.choices[0].message.content)

# The details of this call can be inspected in the
# "Tracing" section of the Azure AI Foundry portal
print("\n✅ Trace information is available in the portal")
print("Azure AI Foundry → Project → Tracing")

Custom tracing

You can also add your own trace information.

# custom_tracing.py
import os
import time
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from opentelemetry import trace

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

# Get an OpenTelemetry tracer
# (spans are only exported if a tracer provider and exporter have been configured)
tracer = trace.get_tracer(__name__)

def process_user_input(user_input):
    """Process the user input."""
    with tracer.start_as_current_span("process_input") as span:
        span.set_attribute("input_length", len(user_input))
        time.sleep(0.1)  # Simulate processing
        return user_input.strip()

def call_ai_model(processed_input):
    """Call the AI model."""
    with tracer.start_as_current_span("ai_model_call") as span:
        span.set_attribute("model", os.getenv("MODEL_DEPLOYMENT_NAME"))
        
        response = project.inference.get_chat_completions(
            model=os.getenv("MODEL_DEPLOYMENT_NAME"),
            messages=[{"role": "user", "content": processed_input}]
        )
        
        answer = response.choices[0].message.content
        span.set_attribute("response_length", len(answer))
        return answer

def format_response(answer):
    """Format the response."""
    with tracer.start_as_current_span("format_response"):
        time.sleep(0.05)  # Simulate processing
        return f"AI: {answer}"

# Main flow
with tracer.start_as_current_span("main_workflow") as main_span:
    user_input = "Tell me what makes Python appealing"
    main_span.set_attribute("user_id", "user123")

    # Run each step
    processed = process_user_input(user_input)
    answer = call_ai_model(processed)
    formatted = format_response(answer)

    print(formatted)

print("\n✅ Custom trace recorded")

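What gets recorded?

The time each step took
Custom attributes (input length, model name, and so on)
Error information (if an error occurred)
The parent-child relationships between steps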
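
To make these four items concrete without any tracing backend, here is a toy recorder written with only the standard library. It is not OpenTelemetry and not an Azure API; the `span` helper and the `records` list are hypothetical names used purely to show what a span carries: a duration, attributes, an error if one occurred, and the name of its parent.

```python
import time
from contextlib import contextmanager

records = []   # finished spans, in the order they end
_stack = []    # names of the spans that are currently open

@contextmanager
def span(name, **attributes):
    """Record duration, attributes, parent name, and any error for one step."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    error = None
    try:
        yield attributes
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        _stack.pop()
        records.append({
            "name": name,
            "parent": parent,
            "duration_s": time.perf_counter() - start,
            "attributes": attributes,
            "error": error,
        })

with span("main_workflow", user_id="user123"):
    with span("process_input", input_length=12):
        time.sleep(0.01)
    try:
        with span("ai_model_call"):
            raise TimeoutError("simulated failure")
    except TimeoutError:
        pass

for r in records:
    print(r["name"], "| parent:", r["parent"], "| error:", r["error"])
```

Inner spans finish before the span that contains them, so `main_workflow` appears last in `records` even though it started first.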

Example: putting timing data to use

# trace_analysis.py
import os
import time
from datetime import datetime
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)
model_name = os.getenv("MODEL_DEPLOYMENT_NAME")

def chatbot_with_metrics(user_message):
    """Chatbot that also returns metrics."""
    start_time = time.time()
    
    try:
        # Call the AI
        ai_start = time.time()
        response = project.inference.get_chat_completions(
            model=model_name,
            messages=[{"role": "user", "content": user_message}]
        )
        ai_duration = time.time() - ai_start
        
        answer = response.choices[0].message.content
        total_duration = time.time() - start_time
        
        # Collect the metrics
        metrics = {
            "timestamp": datetime.now().isoformat(),
            "ai_call_duration": f"{ai_duration:.2f}s",
            "total_duration": f"{total_duration:.2f}s",
            "input_length": len(user_message),
            "output_length": len(answer),
            "model": model_name
        }
        
        return answer, metrics
        
    except Exception as e:
        return None, {"error": str(e)}

# Run the test
print("=== Performance test ===\n")

test_messages = [
    "Hello",
    "Explain Python in about 200 characters",
    "Tell me in detail about the future of artificial intelligence (about 500 characters)"
]

for i, msg in enumerate(test_messages, 1):
    print(f"Test {i}: {msg[:30]}...")
    answer, metrics = chatbot_with_metrics(msg)

    if answer:
        print(f"Response: {answer[:50]}...")
        print(f"Metrics: {metrics}")
    else:
        print(f"Error: {metrics}")
    print()
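
Individual timings are noisy, so it usually helps to look at a distribution over many calls rather than at single measurements. The sketch below summarizes a list of durations with the minimum, median, 95th percentile (nearest-rank method), and maximum. The helper name `summarize_latencies` and the sample values are made up for illustration; in practice the list would be filled with the measured call durations.

```python
import statistics

def summarize_latencies(durations_s):
    """Return min, median, p95, and max for a non-empty list of durations in seconds."""
    ordered = sorted(durations_s)
    # Nearest-rank p95: the smallest value with at least 95% of samples at or below it.
    rank = max(1, -(-95 * len(ordered) // 100))  # ceil(0.95 * n) using integer math
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }

sample = [0.8, 1.1, 0.9, 2.4, 1.0, 1.2, 0.7, 1.3, 0.95, 5.0]
print(summarize_latencies(sample))
```

A median close to one second next to a much larger p95 points to occasional slow calls, which an average alone would hide.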

Combining multiple AI services

One strength of Azure AI Foundry is that various AI services can be combined.

Using Azure AI Services with the SDK

# multi_service_integration.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

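# Main services that can be used alongside the Azure AI Foundry SDK
services_info = """
Main services available from Azure AI Foundry:

1. Azure OpenAI Service
   - Language models such as GPT-4 and GPT-3.5
   - DALL-E (image generation)
   - Whisper (speech recognition)

2. Content Safety
   - Harmful content detection
   - PII (personal information) detection is provided by Azure AI Language

3. Azure AI Search
   - Advanced search features
   - Building RAG (retrieval-augmented generation)

4. Document Intelligence
   - Parsing PDFs and other documents
   - Extracting tables and forms

5. Computer Vision
   - Image analysis
   - OCR (text recognition)

6. Speech Service
   - Speech to text
   - Text to speech
"""

print(services_info)

# These services can be used together within a single project
print("\n✅ All of these services can be used from one project")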

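A simple RAG (retrieval-augmented generation) example

What is RAG?

Short for Retrieval-Augmented Generation
An approach in which your own data is searched and the AI generates an answer based on the search results
Example: the AI answers questions by searching internal company documents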

# simple_rag.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

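# Your own data (in practice, loaded from a database or files)
company_knowledge = """
[Company rules]

1. Working hours
   - Start: 9:00
   - End: 18:00
   - Break: 12:00-13:00

2. Remote work
   - Up to 3 days per week
   - Prior request required

3. Paid leave
   - 20 days granted per year
   - Request by the previous day
"""

def rag_chat(user_question, knowledge_base):
    """RAG-style chat (simplified version)."""
    # Include your own data in the prompt
    prompt = f"""
Answer the question by referring to the company information below.

[Reference information]
{knowledge_base}

[Question]
{user_question}

[Answer rules]
- Answer accurately based on the reference information
- If the information does not cover the question, answer "That is not in the information provided"
"""

    response = project.inference.get_chat_completions(
        model=os.getenv("MODEL_DEPLOYMENT_NAME"),
        messages=[
            {"role": "system", "content": "You are an assistant who knows the company rules well."},
            {"role": "user", "content": prompt}
        ]
    )

    return response.choices[0].message.content

# Test
questions = [
    "How many days per week can I work remotely?",
    "What time does work start?",
    "When is the bonus paid?"  # Not covered by the information
]

print("=== RAG chatbot test ===\n")
for q in questions:
    print(f"Question: {q}")
    answer = rag_chat(q, company_knowledge)
    print(f"Answer: {answer}\n")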
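
The simplified version above pastes the entire knowledge base into every prompt, which stops working once the data no longer fits. The "retrieval" half of RAG picks only the relevant pieces first. Below is a minimal sketch of that step using plain keyword overlap on English text; the `tokenize` and `retrieve` helpers are hypothetical, and a real system would more likely use a search service such as Azure AI Search instead of this scoring.

```python
import re

sections = [
    "Working hours: start 9:00, end 18:00, break 12:00-13:00.",
    "Remote work: up to 3 days per week, prior request required.",
    "Paid leave: 20 days granted per year, request by the previous day.",
]

def tokenize(text):
    """Lowercase the text and return its alphanumeric words as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, top_k=1):
    """Return up to top_k docs that share words with the question, best match first."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(d)), d) for d in docs]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [d for _, d in scored[:top_k]]

print(retrieve("How many days of remote work are allowed?", sections))
print(retrieve("Is there a dress code?", sections))
```

An empty result is a useful signal in its own right: the caller can answer that the information is not available without calling the model at all. Keyword overlap is easily misled by common words, which is one reason production systems rely on a proper search index.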

Preparing for production deployment

Before releasing the app to real users, go through what needs to be in place.

Security checklist

# security_checklist.py

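security_checklist = """
=== Security checklist ===

1. Authentication and authorization
   ✅ Use Microsoft Entra ID (formerly Azure AD) authentication
   ✅ Store API keys in environment variables
   ✅ Add the .env file to .gitignore
   ✅ Use Managed Identity in production

2. Data protection
   ✅ Handle personal information (PII) appropriately
   ✅ Keep sensitive information out of logs
   ✅ Use HTTPS for all communication

3. Rate limiting
   ✅ Implement retry logic
   ✅ Use a backoff strategy
   ✅ Set per-user limits

4. Error handling
   ✅ Do not show detailed error information to users
   ✅ Log errors appropriately
   ✅ Implement fallback handling

5. Content filtering
   ✅ Use Azure AI Content Safety
   ✅ Check both input and output
   ✅ Block harmful content immediately

6. Monitoring and logging
   ✅ Monitor with Application Insights
   ✅ Set up alerts for anomaly detection
   ✅ Review logs regularly
"""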

print(security_checklist)
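
One checklist item, keeping sensitive information out of logs, is easy to sketch in code. The example below masks email addresses and hyphenated phone numbers before a message is logged. The `redact` helper and both regular expressions are illustrative assumptions; simple patterns like these miss many kinds of PII, so this is a starting point and not a substitute for a dedicated PII detection service.

```python
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Rough phone pattern: digit groups separated by hyphens, e.g. 03-1234-5678.
PHONE_PATTERN = re.compile(r"\b\d{2,4}-\d{2,4}-\d{3,4}\b")

def redact(text: str) -> str:
    """Replace email addresses and hyphenated phone numbers with placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    return PHONE_PATTERN.sub("[PHONE]", text)

print(redact("Contact taro@example.com or 03-1234-5678 about order 42."))
```

Calling a function like this at the point where user input or model output is written to the log keeps the raw text from reaching log storage.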

Settings for production

# production_config.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Determine the environment
ENVIRONMENT = os.getenv("ENVIRONMENT", "development")

# Settings per environment
config = {
    "development": {
        "debug": True,
        "log_level": "DEBUG",
        "temperature": 0.7,
        "max_retries": 3,
        "timeout": 30
    },
    "staging": {
        "debug": True,
        "log_level": "INFO",
        "temperature": 0.7,
        "max_retries": 5,
        "timeout": 60
    },
    "production": {
        "debug": False,
        "log_level": "WARNING",
        "temperature": 0.5,  # Favor stability in production
        "max_retries": 5,
        "timeout": 60
    }
}

current_config = config.get(ENVIRONMENT, config["development"])

print(f"=== Current environment: {ENVIRONMENT} ===")
print(f"Settings: {current_config}")

# Create the client
load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

def get_chat_completion_with_config(messages):
    """Chat call that applies the current settings."""
    return project.inference.get_chat_completions(
        model=os.getenv("MODEL_DEPLOYMENT_NAME"),
        messages=messages,
        temperature=current_config["temperature"]
    )

# Test
if current_config["debug"]:
    print("\nRunning a test in debug mode...")
    response = get_chat_completion_with_config([
        {"role": "user", "content": "Hello"}
    ])
    ])
    print(response.choices[0].message.content)
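
The configuration above defines `max_retries` for every environment, but the code shown only uses `temperature`. As a sketch of how a retry setting could be applied, the hypothetical helper `call_with_retries` below wraps any callable in a retry loop with exponential backoff and a little jitter. The demo uses a fake function that fails twice, so it runs without any network access; `flaky`, `calls`, and `waits` exist only for the demonstration.

```python
import random
import time

def call_with_retries(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry.

    max_retries counts retries after the first attempt, so func is called
    at most max_retries + 1 times. The last exception is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)

# Demo: a fake call that fails twice before succeeding.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

waits = []  # collects the delays instead of actually sleeping
result = call_with_retries(flaky, max_retries=5, base_delay=0.5, sleep=waits.append)
print(result, "after", calls["count"], "attempts")
```

In real use, `func` could be a lambda wrapping the chat call and `max_retries` could come from `current_config["max_retries"]`. Catching every exception keeps the sketch short; production code would normally retry only on transient failures such as HTTP 429 or 5xx responses.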

Performance optimization

# performance_optimization.py
import os
import time
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from functools import lru_cache

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)
model_name = os.getenv("MODEL_DEPLOYMENT_NAME")

# Caching (return the same answer for the same question)
@lru_cache(maxsize=100)
def cached_chat(question):
    """Chat call with caching."""
    response = project.inference.get_chat_completions(
        model=model_name,
        messages=[{"role": "user", "content": question}],
        temperature=0.0  # Deterministic output, since the result is cached
    )
    return response.choices[0].message.content

# Performance test
print("=== Performance comparison ===\n")

test_question = "What is Python?"

# First call (not cached)
start = time.time()
answer1 = cached_chat(test_question)
time1 = time.time() - start
print(f"First call: {time1:.2f}s")

# Second call (cached)
start = time.time()
answer2 = cached_chat(test_question)
time2 = time.time() - start
print(f"Second call: {time2:.2f}s")

print(f"\nSpeed-up: {(time1 - time2) / time1 * 100:.1f}%")
print("\n✅ With a cache, repeated questions are answered much faster")
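
`lru_cache` uses the exact question string as its key and never expires entries, so "What is Python?" and " what is  python? " are cached separately, and an outdated answer stays cached until it is evicted. The sketch below shows one possible alternative: a small cache with normalized keys and a time-to-live. The `TTLCache` class and `fake_model` are hypothetical; `fake_model` stands in for the real model call so the example runs offline.

```python
import time

class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # normalized question -> (expiry time, answer)

    @staticmethod
    def normalize(question):
        # Collapse whitespace and ignore case so trivial variations share one entry.
        return " ".join(question.split()).lower()

    def get_or_fetch(self, question, fetch):
        """Return the cached answer, or call fetch(question) and cache the result."""
        key = self.normalize(question)
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        answer = fetch(question)
        self._store[key] = (now + self.ttl, answer)
        return answer

model_calls = []  # records every question that reached the stand-in model

def fake_model(question):
    """Stand-in for the real model call."""
    model_calls.append(question)
    return f"answer to: {question.strip()}"

cache = TTLCache(ttl_seconds=60)
cache.get_or_fetch("What is Python?", fake_model)
cache.get_or_fetch("  what is   python? ", fake_model)
print(f"model calls: {len(model_calls)}")
```

Both questions normalize to the same key, so the stand-in model is called only once. Whether case and whitespace should be ignored depends on the application, so treat the normalization rule as a design choice.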

Best practices

1. Prompt engineering

# prompt_engineering.py

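# ❌ Bad example
bad_prompt = "Tell me about Python"

# ✅ Good example
good_prompt = """
Explain the programming language "Python" for beginners, covering the following points:

1. Features of Python (three)
2. Main uses
3. Benefits of starting to learn it

Please keep it to about 200 characters and easy to understand.
"""

# Best practices:
# - Give specific instructions
# - State the expected format explicitly
# - Specify the length or the number of items
# - Make the target audience clear (for example, beginners)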
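
The same practices can be captured in a small helper so that every prompt states its topic, audience, required points, and length. The following is a minimal sketch; the `build_prompt` function and its parameters are hypothetical, and the wording of the template is just one possibility.

```python
def build_prompt(topic, audience, points, max_chars):
    """Assemble a prompt that states the topic, audience, required points, and length."""
    numbered = "\n".join(f"{i}. {p}" for i, p in enumerate(points, 1))
    return (
        f"Explain {topic} for {audience}, covering the following points:\n\n"
        f"{numbered}\n\n"
        f"Keep the answer to about {max_chars} characters and easy to follow."
    )

prompt = build_prompt(
    topic='the programming language "Python"',
    audience="beginners",
    points=["Three features of Python", "Main uses", "Benefits of starting to learn it"],
    max_chars=200,
)
print(prompt)
```

A helper like this keeps prompts consistent across an application and makes each element easy to vary when testing.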

2. Designing error messages

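# error_messages.py
import logging

from azure.core.exceptions import HttpResponseError

logger = logging.getLogger(__name__)

# ❌ Bad example
def bad_error_handling():
    try:
        # some processing
        pass
    except Exception as e:
        print(f"Error: {e}")  # Shows the technical details as-is

# ✅ Good example
def good_error_handling():
    try:
        # some processing
        pass
    except HttpResponseError as e:
        if e.status_code == 429:
            user_message = "We are receiving a lot of traffic right now. Please wait a moment."
        elif e.status_code == 500:
            user_message = "A problem occurred on the server. Please try again in a little while."
        else:
            user_message = "Something went wrong. Please try again later."

        print(user_message)  # User-friendly message
        # Record the technical details in the log (not shown to the user)
        logger.error(f"HTTP error: {e.status_code}, details: {e}")

# Key points:
# - Give users a message they can understand
# - Record the technical details in the log
# - Suggest the next action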

3. Validating responses

# response_validation.py
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

load_dotenv()
project = AIProjectClient(
    endpoint=os.getenv("PROJECT_ENDPOINT"),
    credential=DefaultAzureCredential()
)

def validate_response(response_text, max_length=1000, min_length=10):
    """Validate a response."""
    issues = []

    # Length check
    if len(response_text) < min_length:
        issues.append("Response is too short")
    if len(response_text) > max_length:
        issues.append("Response is too long")

    # Forbidden phrase check (example)
    forbidden_words = ["I'm sorry, but regarding that", "As an AI language model"]
    for word in forbidden_words:
        if word in response_text:
            issues.append(f"Inappropriate phrase: {word}")

    # Empty response check
    if not response_text.strip():
        issues.append("Response is empty")

    return len(issues) == 0, issues

# Usage example
def safe_chat(user_message):
    """Chat call with validation."""
    response = project.inference.get_chat_completions(
        model=os.getenv("MODEL_DEPLOYMENT_NAME"),
        messages=[{"role": "user", "content": user_message}]
    )

    answer = response.choices[0].message.content

    # Validate
    is_valid, issues = validate_response(answer)

    if not is_valid:
        print(f"⚠️ The response has problems: {issues}")
        # Fallback handling
        return "Sorry, something went wrong. Please try again."

    return answer

Cost optimization tips

# cost_optimization.py

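cost_tips = """
=== Cost optimization tips ===

1. Model selection
   💰 gpt-35-turbo: older low-cost model
   💰 gpt-4o-mini: low cost with a good balance of quality (typically priced below gpt-35-turbo; check current pricing)
   💰💰 gpt-4o: high performance but expensive

   Strategy: start with mini and move to a higher-end model only when needed

2. Token management
   - Set max_tokens appropriately (avoid needlessly long responses)
   - Trim the conversation history to a reasonable length
   - Keep prompts concise

3. Caching
   - Use a cache for repeated questions
   - Prepare frequently asked questions as an FAQ

4. Batch processing
   - Combine multiple requests where possible
   - Run non-real-time jobs overnight

5. Monitoring
   - Check usage regularly
   - Set budget alerts
   - Detect unusual usage patterns

Implementation examples:
"""

print(cost_tips)

# Token management example
def trim_conversation_history(history, max_messages=10):
    """Limit the conversation history."""
    # Keep the system messages
    system_messages = [m for m in history if m["role"] == "system"]
    other_messages = [m for m in history if m["role"] != "system"]

    # Keep only the latest N messages
    if len(other_messages) > max_messages:
        other_messages = other_messages[-max_messages:]

    return system_messages + other_messages

# Usage monitoring example
def track_usage(response):
    """Report token usage."""
    if hasattr(response, 'usage'):
        usage = response.usage
        print("Tokens used:")
        print(f"  Prompt: {usage.prompt_tokens}")
        print(f"  Completion: {usage.completion_tokens}")
        print(f"  Total: {usage.total_tokens}")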
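
Token counts become more actionable once they are converted into money. Here is a minimal sketch of that arithmetic, assuming per-1,000-token prices for input and output. The `estimate_cost` helper is hypothetical and the two prices are placeholders, not real prices for any model; look up the current prices for your model and region.

```python
def estimate_cost(prompt_tokens, completion_tokens, input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one call from token counts and per-1,000-token prices."""
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (completion_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost

# Placeholder prices for illustration only (not real prices).
INPUT_PRICE_PER_1K = 0.001
OUTPUT_PRICE_PER_1K = 0.002

cost = estimate_cost(1200, 300, INPUT_PRICE_PER_1K, OUTPUT_PRICE_PER_1K)
print(f"estimated cost: {cost:.4f}")
```

With these placeholder prices, 1,200 prompt tokens and 300 completion tokens come to 0.0012 + 0.0006 = 0.0018 per call. Multiplying by the expected number of calls per day gives a rough figure to compare against a budget alert.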

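Part 3 summary

Well done! This completes the three-part series.

What you covered this time:

✅ How to evaluate AI apps (AI-as-a-Judge)
✅ Visualizing behavior with tracing
✅ Combining multiple AI services
✅ Preparing for a production deployment
✅ Security and best practices
✅ Cost optimization techniques

Looking back at the whole series:

Part 1: Basics and environment setup

Understanding the SDK concepts
Creating an Azure project
Making the first connection

Part 2: Implementation and interaction

Talking to the AI
Streaming
Tuning parameters
Error handling

Part 3 (this article): Quality and operations

Evaluation and testing
Tracing
Production deployment
Best practices

Next steps

To keep growing as an AI app developer:

Start small
- Begin with a simple chatbot
- Add features gradually
Keep learning
- Follow the latest Azure news
- Join the community
Sharpen your skills through practice
- Use it in real projects
- Incorporate user feedback
Contribute to the community
- Write blog posts about what you learned
- Contribute to open source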

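Closing

Thank you for reading this three-part series to the end!

With the Azure AI Foundry SDK, anyone can build professional AI applications. It may have felt difficult at first, but if you go one step at a time, it will take shape.

Getting Started with the Azure AI Foundry SDK! - Part 1: Environment Setup from the Basics
Getting Started with the Azure AI Foundry SDK! - Part 2: Talking to an AI Model

What kind of AI app would you build?

Enjoy your AI development journey! 🚀

Happy coding! 🎉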
