Slightly modifying RecordSoundBox to create audio files in the format required by Google Speech API v2 #Pepper

Overview

Notes from generating an audio file for use with Google Speech API v2 by way of RecordSoundBox.

Incidentally, if you want to generate such an audio file from a terminal instead, do the following.

Install sox
brew install sox

Record with the rec command
rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav
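
The flags in the rec command above describe the target format: 16-bit signed-integer samples, 1 channel, 16000 Hz. As a minimal sketch, a recorded file can be checked against those values with Python's standard wave module. The function name check_wav_format is hypothetical and only used for illustration.

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width_bytes=2):
    """Return True if the WAV file matches the expected format.

    The defaults mirror the rec flags: 16000 Hz, 1 channel,
    16 bits (2 bytes) per sample.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate
            and wav.getnchannels() == channels
            and wav.getsampwidth() == sample_width_bytes
        )
```

For example, check_wav_format("test.wav") should return True for a file recorded with the command above.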

Changes

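Replace the ALAudioDevice used inside RecordSoundBox with ALAudioRecorder, which lets you specify the file type, sample rate, and channels when recording starts.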

MyClass.py

class MyClass(GeneratedClass):
    def __init__(self):
        GeneratedClass.__init__(self, False)
        try:
            #self.ad = ALProxy("ALAudioDevice")
            self.audioRecorder = ALProxy("ALAudioRecorder")
        except Exception as e:
            #self.ad = None
            self.audioRecorder = None
            self.logger.error(e)
        self.filepath = ""

    def onLoad(self):
        self.bIsRecording = False
        self.bIsRunning = False

    def onUnload(self):
        self.bIsRunning = False
        if( self.bIsRecording ):
            #self.ad.stopMicrophonesRecording()
            self.audioRecorder.stopMicrophonesRecording()
            self.bIsRecording = False

    def onInput_onStart(self, p):
        if(self.bIsRunning):
            return
        self.bIsRunning = True
        sExtension = self.toExtension( self.getParameter("Microphones used") )
        self.filepath = p + sExtension
        self.log(self.filepath)
        #if self.ad:
        if self.audioRecorder:
            self.log('audioRecorder created!!!')
            # Record a 16 kHz WAV from a single microphone (one flag per microphone).
            channels = [0,0,1,0]
            self.audioRecorder.startMicrophonesRecording(self.filepath,'wav', 16000, channels)
            #self.ad.startMicrophonesRecording( self.filepath )
            self.bIsRecording = True
        else:
            self.logger.warning("No sound recorded")

    def onInput_onStop(self):
        if( self.bIsRunning ):
            self.onUnload()
            self.onStopped(self.filepath)

    def toExtension(self, sMicrophones):
        if( sMicrophones == "Front head microphone only (.ogg)" ):
            return ".ogg"
        else:
            return ".wav"
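
The channel list passed to startMicrophonesRecording holds one flag per microphone. As a rough sketch, a small helper can build that list from microphone names so the intent is easier to read. The names MIC_ORDER and channel_mask are hypothetical, and the order (left, right, front, rear) is an assumption based on my reading of the NAOqi ALAudioRecorder documentation, so please verify it for your robot and NAOqi version.

```python
# Assumed microphone order for the ALAudioRecorder channel list.
# This ordering is an assumption; check the NAOqi documentation.
MIC_ORDER = ("left", "right", "front", "rear")


def channel_mask(*mics):
    """Build a channel list such as [0, 0, 1, 0] from microphone names."""
    unknown = set(mics) - set(MIC_ORDER)
    if unknown:
        raise ValueError("unknown microphone(s): %s" % ", ".join(sorted(unknown)))
    # 1 enables a microphone, 0 disables it.
    return [1 if name in mics else 0 for name in MIC_ORDER]


print(channel_mask("front"))  # [0, 0, 1, 0]
```

Under that assumption, channel_mask("front") yields the same list the box script uses.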

Checking that it actually works

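Connect via ssh and copy the audio file generated under /tmp to a suitable location under /home/nao
Download it with Choregraphe's Connection → Advanced → File Transfer (or use whichever FTP client you prefer)
Using this Ruby script (with thanks to its author), call the Speech API and check the result (assuming the API key and so on have already been obtained)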

% ruby ./speech_api_example.rb recordings.wav
{"result"=>[]}
{"result"=>[{"alternative"=>[{"transcript"=>"アップフロンティア", "confidence"=>0.95207101}, {"transcript"=>"アップフロント"}, {"transcript"=>"フロンティア"}, {"transcript"=>"frontier"}, {"transcript"=>"アップフロントよ"}], "final"=>true}], "result_index"=>0}

It worked.
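
The output above shows two result objects: an empty one first, then the one with the recognition alternatives. As a minimal sketch, assuming the raw response body is one JSON object per line (which is what the output suggests), the top transcript could be picked out in Python like this. The function name best_transcript is hypothetical.

```python
import json


def best_transcript(body):
    """Return (transcript, confidence) for the first alternative found, or None.

    Assumes `body` holds one JSON object per line and skips
    objects whose "result" list is empty.
    """
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        for result in json.loads(line).get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                top = alternatives[0]
                # Only some alternatives carry a confidence value.
                return top["transcript"], top.get("confidence")
    return None
```

With the response shown above, this would return the first alternative, "アップフロンティア", together with its confidence.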