
Trying Out the IBM Watson Speech to Text API (Python)

Last updated at 2020-03-18. Posted at 2020-03-18.

Background

In this article we send audio data from Python to Watson, IBM's AI, and run speech recognition (Speech to Text) on it.
It is aimed at people who are new to APIs.

Environment

MacBook: macOS Mojave
Python: Python 3.8.0

Prerequisites

Python is already installed
An IBM Cloud account has already been created (sign-up is available on the IBM Cloud website)

Installation

Run the following command in a terminal or in a Python notebook.

$ pip install "watson-developer-cloud>=1.4.0"
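
To confirm that the install worked before moving on, a small check like the following may help. This is only a minimal sketch using the standard library: `is_installed` is a hypothetical helper name, and `watson_developer_cloud` is the import name used by the code in this article.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can find the module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # watson_developer_cloud is the import name of the package installed above
    if is_installed("watson_developer_cloud"):
        print("watson_developer_cloud is available")
    else:
        print("watson_developer_cloud was not found; re-run the pip command")
```

If the notebook and the terminal use different Python environments, running this in each one shows which of them actually has the package.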

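Checking the credentials

Once you select Speech To Text on the IBM Cloud site, you can view your credentials from My Account.
Copy the API key and URL and paste them into the following code (partially modified from the original sample code).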

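import json

from watson_developer_cloud import SpeechToTextV1

# Settings: paste the credentials copied from IBM Cloud
apikey = "[API key]"
url = "[URL]"
cont_type = "audio/wav"
lang = "ja-JP_BroadbandModel"

# Connect to Watson and send the audio file for recognition
stt = SpeechToTextV1(iam_apikey=apikey, url=url)
with open("voice.wav", "rb") as audio_file:
    result_json = stt.recognize(
        audio=audio_file, content_type=cont_type, model=lang, timestamps=False
    )

# Print the recognition result (a dict parsed from the JSON response)
sttResult = result_json.get_result()
print(sttResult)

# Save the result as a JSON file.
# json.dumps needs the dict from get_result(), not the response object itself.
with open("result.json", "w", encoding="utf-8") as f:
    f.write(json.dumps(sttResult, indent=2, ensure_ascii=False))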
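
Pasting the API key directly into the script is fine for a quick test, but it is easy to leak the key when sharing the code. One possible alternative is to read the credentials from environment variables. The following is a minimal sketch under that assumption: `load_credentials` and the variable names `WATSON_STT_APIKEY` and `WATSON_STT_URL` are hypothetical names chosen for this example, not something defined by IBM or the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Read the API key and URL from environment variables.

    WATSON_STT_APIKEY and WATSON_STT_URL are hypothetical names chosen
    for this sketch; they are not defined by IBM or by the SDK.
    """
    apikey = env.get("WATSON_STT_APIKEY")
    url = env.get("WATSON_STT_URL")
    if not apikey or not url:
        raise RuntimeError("Set WATSON_STT_APIKEY and WATSON_STT_URL first")
    return apikey, url
```

With this in place, `apikey, url = load_credentials()` could replace the two hard-coded assignments, and the values are then passed to `SpeechToTextV1` in the same way as above.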

If you want to extract only the transcript text and save it to a txt file, use the following.

# Save only the transcript text to a txt file
with open("result.txt", "w", encoding="utf-8") as f:
    for result in sttResult["results"]:
        # alternatives[0] is the first candidate returned for this segment
        f.write(result["alternatives"][0]["transcript"])
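
The extraction step can also be tried without calling the API. The sketch below wraps the same lookup in a hypothetical helper, `extract_transcripts`, and runs it on a hand-written sample dict. The sample only imitates the keys that the code above reads (`results`, `alternatives`, `transcript`); its text and confidence values are made up for illustration and are not real Watson output.

```python
def extract_transcripts(stt_result):
    """Return the first alternative's transcript for each result segment."""
    return [
        segment["alternatives"][0]["transcript"]
        for segment in stt_result.get("results", [])
        if segment.get("alternatives")
    ]


# Hand-written sample in the same shape the code above reads from;
# the text and values are made up for illustration.
sample = {
    "results": [
        {"alternatives": [{"transcript": "こんにちは ", "confidence": 0.9}]},
        {"alternatives": [{"transcript": "今日は いい 天気 です ", "confidence": 0.8}]},
    ],
}

lines = extract_transcripts(sample)
# Joining with newlines puts one recognized segment on each line
print("\n".join(lines))
```

Writing `"\n".join(lines)` to the file instead of the concatenated text is one way to keep each recognized segment on its own line.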

That's all!

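Summary

Recognition accuracy seems to be better with GCP and similar services, but Watson has a larger free tier than the others and leaves plenty of room to experiment, so please give it a try!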

References and sources

IBM Cloud official site
IBM Cloud API Docs / Speech to Text
[Python]WatsonのSpeech To Textを使うお話 ("[Python] Using Watson's Speech To Text")
