Trying the Azure Speech to Text REST API (Python 3.6.9)

Posted at 2019-09-30, last updated at 2019-09-30

I tried converting speech to text with Azure Cognitive Services.

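Introduction

Log in to the Azure Portal, choose "Create a resource", search for "Speech", and create the resource.
The requests are authenticated with the resource's subscription key, so copy it.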

Sample code

The STT endpoints are listed here; choose the endpoint for the same region as the resource you created.

Download the audio file (konnichiwa.wav) from here.
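To make the region choice concrete, here is a minimal sketch that assembles the endpoint URL from a region name. It assumes the host follows the pattern seen in the westus URL used in this article, with only the region prefix changing; `build_stt_endpoint` is a hypothetical helper, not part of any Azure SDK.

```python
from urllib.parse import urlencode


def build_stt_endpoint(region, language="ja-JP", result_format="detailed"):
    """Build the recognition URL for a region (hypothetical helper).

    Assumption: every region shares the path used in this article and
    differs only in the host name prefix.
    """
    base = (
        "https://{}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1"
    ).format(region)
    # urlencode escapes the query values and joins them with "&".
    query = urlencode({"language": language, "format": result_format})
    return "{}?{}".format(base, query)


print(build_stt_endpoint("westus"))
```

For `westus` this yields the same URL that the sample code hard-codes. Pass the region of your own resource instead.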

import requests

# Use the endpoint for the region where the Speech resource was created.
endpoint = "https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=ja-JP&format=detailed"
headers = {
    "Content-Type": "audio/wav",
    "Ocp-Apim-Subscription-Key": "<your subscription key>",
}
# Open the audio in binary mode; the with block closes the file after the request.
with open("konnichiwa.wav", "rb") as audio:
    response = requests.post(endpoint, headers=headers, data=audio)
print(response.text)

Result

If the output looks like the following, the request succeeded.

(py36) D:\User\s-fujimoto\sts>python stt.py
{"RecognitionStatus":"Success","Offset":1400000,"Duration":48400000,"NBest":[{"Confidence":1.0,"Lexical":"/こんにちは/こんにちは/コンニチハ","ITN":"こんにちは","MaskedITN":"こんにちは","Display":"こんにちは。"}]}
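The response is JSON, so the recognized text can be pulled out instead of printing the raw body. The sketch below parses the exact output shown above. `best_transcript` and `ticks_to_seconds` are hypothetical helper names, and the conversion assumes that `Offset` and `Duration` are reported in 100-nanosecond units.

```python
import json

# The response body printed above, pasted verbatim.
sample = (
    '{"RecognitionStatus":"Success","Offset":1400000,"Duration":48400000,'
    '"NBest":[{"Confidence":1.0,"Lexical":"/こんにちは/こんにちは/コンニチハ",'
    '"ITN":"こんにちは","MaskedITN":"こんにちは","Display":"こんにちは。"}]}'
)

# Assumption: Offset and Duration count 100-nanosecond ticks.
TICKS_PER_SECOND = 10 ** 7


def best_transcript(body):
    """Return the Display text of the most confident hypothesis, or None."""
    result = json.loads(body)
    if result.get("RecognitionStatus") != "Success":
        return None
    candidates = result.get("NBest", [])
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item.get("Confidence", 0.0))
    return best.get("Display")


def ticks_to_seconds(ticks):
    """Convert a tick count to seconds under the assumption above."""
    return ticks / TICKS_PER_SECOND


print(best_transcript(sample))
print(ticks_to_seconds(json.loads(sample)["Duration"]))
```

In the sample script, `response.text` can be passed to `best_transcript` in place of `sample`.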

References

Speech to Text REST API
