More than 5 years have passed since last update.

Trying out the Azure Text to Speech REST API (Python 3.6.9)

I tried converting text to speech with Azure Cognitive Services.

Introduction

Log in to the Azure Portal, choose "Create a resource", search for "Speech", and create the resource.
You will need the resource group name, the endpoint, and the subscription key, so copy them.

Sample code

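Replace 'YOUR_RESOURCE_NAME' with the name of the resource group you created. (The listing below does not contain this placeholder; its User-Agent header is set to 'speechsdk'.)
Replace "YOUR_KEY_HERE" with your subscription key.
Change fetch_token_url and base_url to match the endpoint of the region where you created the resource.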
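
As a rough illustration of the region change described above, the two URLs can be derived from a region name instead of being edited by hand. This is only a sketch: it assumes other regions follow the same hostname pattern as the westus URLs in the sample, so check it against the endpoint shown in the Azure Portal. The helper names speech_endpoints and load_settings, and the environment variable names SPEECH_KEY and SPEECH_REGION, are hypothetical and not part of the original sample.

```python
import os


def speech_endpoints(region):
    """Return (fetch_token_url, tts_url) for a region name such as 'westus'.

    Assumption: every region uses the same URL pattern as the westus
    URLs hardcoded in the sample.
    """
    fetch_token_url = (
        'https://{0}.api.cognitive.microsoft.com/sts/v1.0/issuetoken'.format(region)
    )
    tts_url = 'https://{0}.tts.speech.microsoft.com/cognitiveservices/v1'.format(region)
    return fetch_token_url, tts_url


def load_settings(environ=os.environ):
    """Read the key and region from hypothetical environment variables.

    Falls back to the placeholder key and the westus region used in the sample.
    """
    key = environ.get('SPEECH_KEY', 'YOUR_KEY_HERE')
    region = environ.get('SPEECH_REGION', 'westus')
    return key, region


if __name__ == '__main__':
    key, region = load_settings()
    token_url, tts_url = speech_endpoints(region)
    print(token_url)
    print(tts_url)
```

Keeping the key in an environment variable also avoids committing it together with the script. The full listing below hardcodes the westus values.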

import requests
import time
from xml.etree import ElementTree

# Python 2 compatibility: use raw_input as input when it exists.
try:
    input = raw_input
except NameError:
    pass

class TextToSpeech(object):
    def __init__(self, subscription_key):
        self.subscription_key = subscription_key
        self.tts = input("What would you like to convert to speech: ")
        self.timestr = time.strftime("%Y%m%d-%H%M")
        self.access_token = None

    def get_token(self):
        # Exchange the subscription key for an access token.
        fetch_token_url = 'https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken'
        headers = {
            'Ocp-Apim-Subscription-Key': self.subscription_key
        }
        response = requests.post(fetch_token_url, headers=headers)
        # Stop here on a bad key or wrong region instead of using an error body as the token.
        response.raise_for_status()
        self.access_token = str(response.text)

    def save_audio(self):
        base_url = 'https://westus.tts.speech.microsoft.com/'
        path = 'cognitiveservices/v1'
        constructed_url = base_url + path
        headers = {
            'Authorization': 'Bearer ' + self.access_token,
            'Content-Type': 'application/ssml+xml',
            'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm',
            'User-Agent': 'speechsdk'
        }
        xml_body = ElementTree.Element('speak', version='1.0')
        xml_body.set('{http://www.w3.org/XML/1998/namespace}lang', 'en-US')
        voice = ElementTree.SubElement(xml_body, 'voice')
        voice.set('{http://www.w3.org/XML/1998/namespace}lang', 'en-US')
        voice.set(
            'name', 'Microsoft Server Speech Text to Speech Voice (en-US, Guy24KRUS)')
        voice.text = self.tts
        body = ElementTree.tostring(xml_body)

        response = requests.post(constructed_url, headers=headers, data=body)
        if response.status_code == 200:
            with open('sample-' + self.timestr + '.wav', 'wb') as audio:
                audio.write(response.content)
                print("\nStatus code: " + str(response.status_code) +
                      "\nYour TTS is ready for playback.\n")
        else:
            print("\nStatus code: " + str(response.status_code) +
                  "\nSomething went wrong. Check your subscription key and headers.\n")

if __name__ == "__main__":
    subscription_key = "YOUR_KEY_HERE"
    app = TextToSpeech(subscription_key)
    app.get_token()
    app.save_audio()
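
To see what save_audio actually sends, the SSML body can be built and printed without calling the service. The sketch below repeats the same ElementTree steps in a standalone function. The name build_ssml and its parameters are introduced here for illustration and are not part of the original sample; the voice name is the one used in the sample.

```python
from xml.etree import ElementTree

# Qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = '{http://www.w3.org/XML/1998/namespace}lang'
VOICE_NAME = 'Microsoft Server Speech Text to Speech Voice (en-US, Guy24KRUS)'


def build_ssml(text, lang='en-US', voice_name=VOICE_NAME):
    """Build the SSML request body as bytes, mirroring save_audio() in the sample."""
    speak = ElementTree.Element('speak', version='1.0')
    speak.set(XML_LANG, lang)
    voice = ElementTree.SubElement(speak, 'voice')
    voice.set(XML_LANG, lang)
    voice.set('name', voice_name)
    # ElementTree escapes &, < and > in the text when serializing,
    # so user input cannot break the XML structure.
    voice.text = text
    return ElementTree.tostring(speak)


if __name__ == '__main__':
    print(build_ssml('Tom & Jerry').decode('ascii'))
```

Building the body with ElementTree rather than string concatenation is what keeps input such as "Tom & Jerry" valid XML.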

Result

If you see output like the following, it worked. An audio file named sample-yyyymmdd-hhmm.wav should have been created in the current folder.

(py36) D:\User\s-fujimoto\sts>python tts.py
What would you like to convert to speech: hello world

Status code: 200
Your TTS is ready for playback.
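
Beyond the status code, the generated file itself can be checked with Python's standard wave module. This is a minimal sketch that assumes the response was saved as a standard PCM WAV file; describe_wav is a hypothetical helper name, not part of the original sample.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, 'rb') as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        frame_rate = wav.getframerate()
        # Duration is the number of frames divided by frames per second.
        duration = wav.getnframes() / float(frame_rate)
    return channels, sample_width, frame_rate, duration
```

For the 'riff-24khz-16bit-mono-pcm' output format requested in the sample, one would expect 1 channel, a sample width of 2 bytes, and a frame rate of 24000.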

References

Quickstart: Convert text to speech using Python
