Speech Recognition with Python

Python

Posted at 2016-12-06, last updated at 2016-12-06

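Introduction

I have long wanted to lower the barrier to entry for things like programming and CG production, and since I would also like to drive Maya by speech recognition, I have been looking into it bit by bit.

This article does not get as far as actually controlling Maya.

It is just a simple test of the Microsoft Speech API, which ships with Windows by default.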

Steps

Windows10
Python 2.7.12
pywin32==220

python -m virtualenv venv
venv\Scripts\activate.bat
Install win32com (provided by pywin32).
You may also need to install the Speech SDK.
Run python venv\Lib\site-packages\win32com\client\makepy.py and convert Microsoft Speech Object Library 5.4 into Python code.
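
The makepy step above can probably also be done from Python instead of through the makepy.py dialog. The following is a minimal sketch, assuming that win32com.client.gencache.EnsureDispatch generates the same wrappers for the Speech Object Library when given a SAPI ProgID; this was not tested in the original article. The function name ensure_sapi_wrappers and its gencache parameter are hypothetical and exist only so the logic can be tried without Windows.

```python
def ensure_sapi_wrappers(prog_id="SAPI.SpSharedRecognizer", gencache=None):
    """Generate (or reuse) makepy wrappers for the type library behind prog_id.

    gencache can be injected for testing; when it is omitted,
    win32com.client.gencache is imported lazily (needs pywin32 on Windows).
    """
    if gencache is None:
        from win32com.client import gencache  # Windows + pywin32 only
    # EnsureDispatch builds the wrapper module on first use and returns
    # a dispatch object bound to the generated classes.
    return gencache.EnsureDispatch(prog_id)
```

On Windows, calling ensure_sapi_wrappers() once before importing main.py should make win32com.client.constants and getevents usable, in the same way as running makepy.py by hand.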

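Running the program below speaks the message "準備完了" ("Ready") and then waits for voice input. When it hears "One", "Two", "Three", "Four", or "こんにちわ" ("Hello"), it prints the recognized phrase to standard output.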

main.py

#!/usr/bin/env python
# coding=utf-8

from __future__ import absolute_import, division, print_function

from win32com.client import constants
import win32com.client
import pythoncom


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    u"""Callback class invoked when a phrase is recognized.

    The words to recognize must be registered beforehand.
    """
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        print(u"You said: {0}".format(newResult.PhraseInfo.GetText()))


class SpeechRecognition(object):
    def __init__(self, wordsToAdd):
        u"""Initialize the speaker, the recognizer, and the grammar."""
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")

        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        self.context = self.listener.CreateRecoContext()
        self.grammar = self.context.CreateGrammar()
        self.grammar.DictationSetState(0)

        self.wordsRule = self.grammar.Rules.Add(
            "wordsRule",
            constants.SRATopLevel + constants.SRADynamic, 0)
        self.wordsRule.Clear()
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        self.grammar.Rules.Commit()
        self.eventHandler = ContextEvents(self.context)

        self.say(u"じゅんびかんりょう")

    def say(self, phrase):
        u"""Speak the given phrase aloud."""
        self.speaker.Speak(phrase)


if __name__ == '__main__':
    wordsToAdd = ["One", "Two", "Three", "Four", u"こんにちわ"]
    speechReco = SpeechRecognition(wordsToAdd)
    while True:
        pythoncom.PumpWaitingMessages()
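
The loop at the end of main.py calls PumpWaitingMessages() as fast as it can, which keeps one CPU core busy while waiting. A minimal sketch of a gentler loop follows, assuming a short sleep between pumps is acceptable for recognition latency. The helper name pump_messages, its parameters, and the 0.05 second interval are all hypothetical choices, not part of pywin32 or the original program.

```python
import time


def pump_messages(pump, should_stop=lambda: False, interval=0.05,
                  sleep=time.sleep):
    """Call pump() repeatedly until should_stop() returns True.

    Sleeps `interval` seconds after each call so the loop does not spin.
    Returns the number of times pump() was called.
    """
    pumped = 0
    while not should_stop():
        pump()           # e.g. pythoncom.PumpWaitingMessages on Windows
        pumped += 1
        sleep(interval)  # yield the CPU between pumps
    return pumped
```

In main.py this would replace the while loop with pump_messages(pythoncom.PumpWaitingMessages); with the default should_stop it runs until interrupted, just like the original.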

Sorry that this has nothing to do with Maya.

References

Python + pywin32 で COM 叩いてしゃべらせる。 (Making Windows speak by calling COM from Python + pywin32)
Python: win32com.client.getevents("SAPI.SpSharedRecoContext") returns None
