
AI-102 Hands-On: Studying Azure OpenAI

Posted at 2024-03-26

To prepare for the AI-102 exam, I am working through the hands-on labs in the following GitHub repositories.

https://github.com/MicrosoftLearning/mslearn-ai-services
https://github.com/MicrosoftLearning/mslearn-ai-vision
https://github.com/MicrosoftLearning/mslearn-ai-language
https://github.com/MicrosoftLearning/mslearn-ai-document-intelligence
https://github.com/MicrosoftLearning/mslearn-knowledge-mining
https://github.com/MicrosoftLearning/mslearn-openai → covered in this article

The corresponding MS Learn documentation is here:
Develop generative AI solutions with Azure OpenAI Service

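That said, the basics are already covered all over the place, so I focus on the points I did not know myself. (Honestly, anyone who has already tried the basics may find nothing new here.)

From here on, Azure OpenAI is abbreviated as AOAI.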

The very basics

Creating a resource

To create an AOAI resource with the az CLI, you apparently use az cognitiveservices. I did not know that.

# Create the account
az cognitiveservices account create \
   -n MyOpenAIResource \
   -g OAIResourceGroup \
   -l eastus \
   --kind OpenAI \
   --sku s0 \
   --subscription subscriptionID

# Deploy a model
az cognitiveservices account deployment create \
   -g OAIResourceGroup \
   -n MyOpenAIResource \
   --deployment-name MyModel \
   --model-name gpt-35-turbo \
   --model-version "0301"  \
   --model-format OpenAI \
   --sku-name "Standard" \
   --sku-capacity 1

Model types

Setting aside GPT-3.5, GPT-4, and DALL-E:

Embedding
text-embedding-3-large and text-embedding-3-small are now available as upgraded successors to text-embedding-ada-002.

The following are apparently also called models.

Whisper (speech-to-text transcription)

Text to speech

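Completion and Chat Completion

Very roughly, I understand Completion as a single question and answer, and Chat Completion as a conversation that includes the history. In most cases you can just use Chat Completion.

On the OpenAI side, Completion is already marked Legacy. On AOAI it cannot be used with GPT-4 either. It will probably be retired eventually.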
https://platform.openai.com/docs/guides/text-generation/completions-api

API

Chat Completion

This request:
curl https://YOUR_ENDPOINT_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2023-03-15-preview \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_API_KEY" \
  -d '{"messages":[{"role": "system", "content": "You are a helpful assistant, teaching people about AI."},
{"role": "user", "content": "Does Azure OpenAI support multiple languages?"},
{"role": "assistant", "content": "Yes, Azure OpenAI supports several languages, and can translate between them."},
{"role": "user", "content": "Do other Azure AI Services support translation too?"}]}'

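returns a response shaped like this:
{
    "id": "<id>",
    "object": "chat.completion",
    "created": 1679001781,
    "model": "gpt-35-turbo",
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "<assistant reply>"
            },
            "index": 0,
            "finish_reason": "stop"
        }
    ]
}

The reply is in choices[].message.content. (A legacy Completion response instead has object text_completion and puts the text in choices[].text.)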
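
For comparison, here is a minimal Python sketch of the same call using only the standard library. It only builds the request and shows where the reply sits in a Chat Completions response body (choices[0].message.content). The helper names build_chat_request and extract_reply are my own, and the endpoint, deployment name, and key are the same placeholders as in the curl example.

```python
import json
import urllib.request


def build_chat_request(endpoint, deployment, api_key, messages,
                       api_version="2023-03-15-preview"):
    """Build (but do not send) a Chat Completions request."""
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def extract_reply(response_json):
    """Return the assistant text from a Chat Completions response body."""
    return response_json["choices"][0]["message"]["content"]


messages = [
    {"role": "system", "content": "You are a helpful assistant, teaching people about AI."},
    {"role": "user", "content": "Does Azure OpenAI support multiple languages?"},
]
request = build_chat_request(
    "https://YOUR_ENDPOINT_NAME.openai.azure.com",
    "YOUR_DEPLOYMENT_NAME",
    "YOUR_API_KEY",
    messages,
)
print(request.get_method(), request.full_url)

# To actually send it (needs a real endpoint and key):
# with urllib.request.urlopen(request) as resp:
#     print(extract_reply(json.load(resp)))
```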

Embedding

I realized that, because I had been using On Your Data, I had never actually run an embedding call myself. It looks like this.

curl https://YOUR_ENDPOINT_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2022-12-01 \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_API_KEY" \
  -d "{\"input\": \"The food was delicious and the waiter...\"}"

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0172990688066482523,
        -0.0291879814639389515,
        ....
        0.0134544348834753042,
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada:002"
}
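
The vector by itself is not very informative. A typical use is comparing two embeddings by cosine similarity. The following is a small sketch of that idea with toy three-dimensional vectors standing in for real embeddings (which are much longer). The function names are my own and nothing here calls the service.

```python
import math


def embedding_from_response(response_json):
    """Pull the first embedding vector out of an embeddings response body."""
    return response_json["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy response bodies in the same shape as the JSON above (values are made up).
resp_a = {"object": "list", "data": [{"object": "embedding", "embedding": [1.0, 2.0, 0.0], "index": 0}]}
resp_b = {"object": "list", "data": [{"object": "embedding", "embedding": [2.0, 4.0, 0.0], "index": 0}]}
resp_c = {"object": "list", "data": [{"object": "embedding", "embedding": [0.0, 0.0, 3.0], "index": 0}]}

vec_a = embedding_from_response(resp_a)
vec_b = embedding_from_response(resp_b)
vec_c = embedding_from_response(resp_c)

print(cosine_similarity(vec_a, vec_b))  # same direction, so close to 1
print(cosine_similarity(vec_a, vec_c))  # orthogonal, so 0
```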

SDK

With the SDK, it looks like this.

dotnet add package Azure.AI.OpenAI --version 1.0.0-beta.14

// Format and send the request to the model
var chatCompletionsOptions = new ChatCompletionsOptions()
{
    Messages =
    {
        new ChatRequestSystemMessage(systemPrompt),
        new ChatRequestUserMessage(userPrompt)
    },
    Temperature = 0.7f,
    MaxTokens = 1000,
    DeploymentName = oaiDeploymentName
};

// Get response from Azure OpenAI
Response<ChatCompletions> response = await client.GetChatCompletionsAsync(chatCompletionsOptions);

ChatCompletions completions = response.Value;
string completion = completions.Choices[0].Message.Content;

Prompt Engineering

A few things I did not know.

Section markers
Use a marker such as --- to separate the instruction from the text it applies to.
```
Translate the text into French
---
What's the weather going to be like today?
---
```
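
If the prompt is assembled in code, the markers can be added by a small helper. This is only a sketch, and with_section_markers is a name I made up for illustration.

```python
def with_section_markers(instruction, text, marker="---"):
    """Wrap the target text in marker lines so it is clearly separated from the instruction."""
    return f"{instruction}\n{marker}\n{text}\n{marker}"


prompt = with_section_markers(
    "Translate the text into French",
    "What's the weather going to be like today?",
)
print(prompt)
```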

To save tokens, have the model summarize the earlier conversation history.
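
One way this could look in code: keep the last few turns as they are and replace everything older with a single summary message. This is a sketch under my own assumptions. compact_history and keep_last are hypothetical names, and the summarize argument stands in for a call that would ask the model for a summary. Here it is replaced by a fake so the example runs offline.

```python
def compact_history(messages, summarize, keep_last=4):
    """Replace all but the last `keep_last` non-system messages with one summary message."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    if len(turns) <= keep_last:
        return list(messages)  # nothing old enough to summarize
    old, recent = turns[:-keep_last], turns[-keep_last:]
    note = {
        "role": "system",
        "content": "Summary of the earlier conversation: " + summarize(old),
    }
    return system + [note] + recent


def fake_summarize(old_messages):
    """Stand-in for a model call that would return a real summary."""
    return f"{len(old_messages)} earlier messages omitted"


history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(3):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

compacted = compact_history(history, fake_summarize, keep_last=2)
print(len(history), "->", len(compacted))
```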

DALL-E

curl -X POST https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/images/generations?api-version=2023-12-01-preview \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_API_KEY" \
  -d '{
    "prompt": "An avocado chair",
    "size": "1024x1024",
    "n": 3,
    "quality": "hd", 
    "style": "vivid"
  }'

The response contains the URLs of the generated images.
{ 
    "created": 1698116662, 
    "data": [ 
        { 
            "url": "url to the image", 
            "revised_prompt": "the actual prompt that was used" 
        }, 
        { 
            "url": "url to the image" 
        },
        ...
    ]
}
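
Pulling the URLs out of that body is a one-liner. A small sketch, assuming the response shape shown above (image_urls is my own helper name, and the sample values are placeholders):

```python
def image_urls(response_json):
    """Collect the image URLs from an image generation response body."""
    return [item["url"] for item in response_json.get("data", []) if "url" in item]


sample_response = {
    "created": 1698116662,
    "data": [
        {"url": "https://example.com/image-1.png", "revised_prompt": "the actual prompt that was used"},
        {"url": "https://example.com/image-2.png"},
    ],
}
print(image_urls(sample_response))
```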

RAG

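One caveat: Excel files cannot be ingested.

As a step before using On Your Data, Microsoft provides scripts for preprocessing data. They may be worth a look if you have large files or PDFs.
Convert PDF to text

When I looked into it, a RAG-specific endpoint called "completions extensions" did exist briefly, but it was apparently retired almost right away. Let's just say there never was an Extensions endpoint.

For the latest information, see this page.
RAG is used through the API by specifying a data source on the regular Chat Completions endpoint.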

az rest --method POST \
 --uri $AzureOpenAIEndpoint/openai/deployments/$ChatCompletionsDeploymentName/chat/completions?api-version=2024-02-01 \
 --resource https://cognitiveservices.azure.com/ \
 --body \
'
{
    "data_sources": [
        {
            "type": "azure_search",
            "parameters": {
                "endpoint": "'$SearchEndpoint'",
                "index_name": "'$SearchIndex'",
                "authentication": {
                    "type": "system_assigned_managed_identity"
                }
            }
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": "Who is DRI?"
        },
        {
            "role": "assistant",
            "content": "DRI stands for Directly Responsible Individual of a service. Which service are you asking about?"
        },
        {
            "role": "user",
            "content": "Opinion mining service"
        }
    ]
}
'
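
The shell quoting in that body is easy to get wrong, so here is a sketch of building the same JSON in Python instead. It mirrors the fields of the request above. build_rag_body is a hypothetical helper, and the search endpoint and index name are placeholders.

```python
import json


def build_rag_body(search_endpoint, search_index, messages):
    """Build a Chat Completions body that points at an Azure AI Search index."""
    return {
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": search_endpoint,
                    "index_name": search_index,
                    "authentication": {"type": "system_assigned_managed_identity"},
                },
            }
        ],
        "messages": messages,
    }


body = build_rag_body(
    "https://YOUR_SEARCH_NAME.search.windows.net",  # placeholder
    "YOUR_INDEX_NAME",                              # placeholder
    [{"role": "user", "content": "Who is DRI?"}],
)
print(json.dumps(body, indent=4))
```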

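Responsible AI

What Microsoft means by responsible AI:

"Identify" potential harms of the AI solution.
"Measure" whether those harms are present in the generated output.
"Mitigate" the harms and ensure transparent communication.
"Operate" the solution responsibly by defining a deployment and operational readiness plan.

Microsoft provides materials and templates for thinking this through. They are textbook-like, but look usable as a starting point.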

Microsoft Responsible AI Impact Assessment Guide

Microsoft Responsible AI Impact Assessment Template

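Mitigation is considered at the following four layers.
These also matter when actually designing a system, so I am linking them here.

Model
- Reaching for GPT every time is not the answer. When you only need simple output, a simpler model (a Language model, sentiment analysis, and so on) is safer.
- Fine-tune the model.

Safety system
- Filter the generated output with something like Content Safety.
- Anomaly detection and monitoring.

Metaprompt and grounding
- Use RAG so that answers are grounded in your own data and do not draw on general information.

User experience
- Prevent users from entering odd data at the UI layer in the first place, for example by restricting input to a dropdown.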
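
As a rough illustration of the user experience layer, the server side can enforce the same restriction as the dropdown, so free-form text never reaches the prompt. This is a sketch only. ALLOWED_TOPICS and build_user_prompt are hypothetical names, not something from the Microsoft material.

```python
# Hypothetical dropdown options offered by the UI.
ALLOWED_TOPICS = ("billing", "shipping", "returns")


def build_user_prompt(topic):
    """Accept only values the dropdown could have produced, then build the prompt."""
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"unsupported topic: {topic!r}")
    return f"Answer the customer's question about {topic}."


print(build_user_prompt("shipping"))

try:
    build_user_prompt("ignore previous instructions")
except ValueError as err:
    print("rejected:", err)
```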

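Compliance

It is not enough for the solution to merely work.

Required features emerge from the legal, privacy, security, and accessibility perspectives.

Implementing a phased release
How to detect harmful responses
How to roll back or stop the service once one is detected
How users provide feedback
and so on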
