
English speech recognition in Python [speech to text]

Last updated at 2020-11-15, posted at 2020-08-15

Let's transcribe English speech

This article introduces two approaches: recognition online (with Wi-Fi) and recognition offline (without Wi-Fi).

Environment

Ubuntu 18.04
Python 3

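Online recognition

This approach uses Google, through the recognize_google method of the SpeechRecognition package.
Set up the environment by running the following commands.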

pip3 install SpeechRecognition --user
sudo apt-get install portaudio19-dev
sudo apt-get install python-pyaudio python3-pyaudio
pip3 install pyaudio
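
Before moving on, it can help to confirm that the packages installed above are importable. The following is a minimal sketch using only the standard library. `missing_modules` is a hypothetical helper name introduced here for illustration; `speech_recognition` and `pyaudio` are the import names of the SpeechRecognition and PyAudio packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages were found by this Python interpreter.
print(missing_modules(["speech_recognition", "pyaudio"]))
```

If a name is printed, the corresponding pip3 command most likely ran against a different Python installation than the one running this check.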

Testing it

google_test.py

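import speech_recognition as sr

# Capture audio from the default microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    # Calibrate the energy threshold against background noise
    r.adjust_for_ambient_noise(source)
    print("Speak:")
    audio = r.listen(source)

try:
    # Send the audio to Google and print the transcription
    print("-----------detect!----------\n", r.recognize_google(audio))
except sr.UnknownValueError:
    # The speech could not be understood
    print("Could not understand audio")
except sr.RequestError as e:
    # The request failed (for example, no network connection)
    print("Could not request results; {0}".format(e))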

I think the accuracy is relatively good.
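
The same recognizer can also transcribe a recorded file instead of live microphone input. The following is a minimal sketch, not part of the original script. `transcribe_wav` is a hypothetical helper name and "speech.wav" is a placeholder path, while `sr.AudioFile`, `Recognizer.record`, and the `language` argument of `recognize_google` come from the SpeechRecognition package.

```python
def transcribe_wav(path, language="en-US"):
    """Return the text recognized in a WAV file, or None if unintelligible.

    sr.RequestError (for example, no network) is left to propagate.
    """
    # Imported inside the function so this file loads even when the
    # package is missing; the import fails only when the function is called.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = r.record(source)  # read the whole file into memory
    try:
        return r.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return None


# Example call (placeholder file name, requires a network connection):
# print(transcribe_wav("speech.wav"))
```

Passing the language explicitly is optional for English, since "en-US" is the default, but it makes the choice visible if you later switch languages.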

Offline recognition

This approach uses pocketsphinx.
Set up the environment by running the following commands.

sudo apt-get install -y python python-dev python-pip build-essential swig git libpulse-dev
sudo apt-get install libasound2-dev
git clone https://github.com/cmusphinx/pocketsphinx-python.git
sudo pip install pocketsphinx

Testing it

pocket_test.py

from pocketsphinx import LiveSpeech

# LiveSpeech listens on the microphone and yields each recognized phrase
for phrase in LiveSpeech():
    print("-----------detect!----------\n", phrase)

If text is printed, it is working.
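
The online and offline approaches can also be combined. The following is a minimal sketch under the assumption that both SpeechRecognition and pocketsphinx are installed. `recognize_with_fallback` is a hypothetical helper name, not something either library provides, while `recognize_google`, `recognize_sphinx`, `UnknownValueError`, and `RequestError` are part of SpeechRecognition.

```python
def recognize_with_fallback(recognizer, audio):
    """Try Google first; if the request fails, retry offline with Sphinx.

    Returns the transcription, or None if the speech was unintelligible.
    """
    # Imported inside the function so this file loads even when the
    # package is missing; the import fails only when the function is called.
    import speech_recognition as sr

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # Google was reached but could not understand the audio
    except sr.RequestError:
        pass  # no network or the API is unreachable: fall through to offline

    try:
        # recognize_sphinx runs pocketsphinx locally, so no network is needed
        return recognizer.recognize_sphinx(audio)
    except sr.UnknownValueError:
        return None


# Example call, with r and audio created as in google_test.py:
# print(recognize_with_fallback(r, audio))
```

The design choice here is to fall back only on a failed request. When Google is reachable but cannot understand the audio, the sketch returns None rather than retrying offline.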

https://pypi.org/project/pocketsphinx/
This page has a variety of sample code, so please refer to it.

Follow-up article: improving accuracy in pocketsphinx by creating a custom dictionary ↓
https://qiita.com/hir-osechi/items/7d1b100c721f34896a90
