Using Azure OpenAI Service to transcribe Japanese audio data with Whisper

Last updated at 2024-09-26, posted at 2024-09-25

What is the Whisper model in Azure OpenAI Service?

The Whisper model is OpenAI's speech-to-text model, which can be used to transcribe audio files.

Official documentation: overview of the Whisper model

Official documentation: quickstart

Sample audio file used in this article

Purpose of this article

The official documentation above does not make clear when to use Transcriptions and when to use Translations, so this article is a memo on how to call each API.

How to run

To output a transcription in the same language as the input

Use the Transcriptions REST API.

$ curl "https://<resource name>.openai.azure.com/openai/deployments/<deployment name>/audio/transcriptions?api-version=2024-06-01" \
-H "api-key: <API key>" \
-H "Content-Type: multipart/form-data" \
-F file="@./audio.wav"

Execution result on Ubuntu 20.04
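
The Transcriptions call can also be wrapped in a small shell function so that the endpoint is assembled in one place. The following is a minimal sketch and not part of the original commands: the function names `transcriptions_url` and `transcribe_ja` and the environment variable `AZURE_OPENAI_API_KEY` are hypothetical. It also sends the optional `language` form field (here `ja`) as a hint for the input language, assuming the deployed API version accepts it; the curl command in this article relies on automatic language detection instead.

```shell
# Build the Transcriptions endpoint URL.
# $1: resource name, $2: deployment name, $3: api-version (optional)
transcriptions_url() {
  printf 'https://%s.openai.azure.com/openai/deployments/%s/audio/transcriptions?api-version=%s\n' \
    "$1" "$2" "${3:-2024-06-01}"
}

# Transcribe a Japanese audio file (hypothetical helper).
# $1: resource name, $2: deployment name, $3: path to the audio file
# Assumes the key is exported as AZURE_OPENAI_API_KEY (assumed variable name).
transcribe_ja() {
  curl -sS "$(transcriptions_url "$1" "$2")" \
    -H "api-key: ${AZURE_OPENAI_API_KEY}" \
    -F "file=@$3" \
    -F "language=ja"
}
```

It would be called as `transcribe_ja <resource name> <deployment name> ./audio.wav`. Defining the functions sends no request by itself.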

To output the result translated into English

Use the Translations REST API.

$ curl "https://<resource name>.openai.azure.com/openai/deployments/<deployment name>/audio/translations?api-version=2024-06-01" \
-H "api-key: <API key>" \
-H "Content-Type: multipart/form-data" \
-F file="@./audio.wav"

Execution result on Ubuntu 20.04
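
The two requests in this article differ only in the last path segment (`transcriptions` versus `translations`), so the same audio file can be sent to both endpoints to compare the original-language text with the English text. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function names `whisper_url` and `whisper_compare` and the environment variables `AZURE_OPENAI_RESOURCE`, `AZURE_OPENAI_DEPLOYMENT`, and `AZURE_OPENAI_API_KEY` are hypothetical names chosen for illustration.

```shell
# Print the endpoint URL for one operation: "transcriptions" or "translations".
# Reads AZURE_OPENAI_RESOURCE and AZURE_OPENAI_DEPLOYMENT (assumed variable names).
whisper_url() {
  case "$1" in
    transcriptions|translations) ;;
    *) echo "unknown operation: $1" >&2; return 1 ;;
  esac
  printf 'https://%s.openai.azure.com/openai/deployments/%s/audio/%s?api-version=2024-06-01\n' \
    "$AZURE_OPENAI_RESOURCE" "$AZURE_OPENAI_DEPLOYMENT" "$1"
}

# Send the same audio file ($1) to both endpoints and label each response.
# Assumes the key is exported as AZURE_OPENAI_API_KEY (assumed variable name).
whisper_compare() {
  for op in transcriptions translations; do
    url=$(whisper_url "$op") || return 1
    echo "== $op =="
    curl -sS "$url" -H "api-key: ${AZURE_OPENAI_API_KEY}" -F "file=@$1"
    echo
  done
}
```

With the three variables exported, `whisper_compare ./audio.wav` would print the Transcriptions response followed by the Translations response. Rejecting any other operation name in `whisper_url` keeps a typo from being sent as a request to a nonexistent path.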
