
Making number recognition consistent with Web Speech API Speech Recognition

Posted at 2019-11-13

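Overview

https://qiita.com/hmmrjn/items/4b77a86030ed0071f548
I was building a speech recognition application, using articles such as the one above as a reference.
When dictating values such as "10キロ" or "20キロ" (10 kilo, 20 kilo), the transcript mixes Arabic digits and kanji numerals, which makes the values awkward to handle, so I wanted a way to fix that.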

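Steps to reproduce

Open the following demo site:

https://www.google.com/intl/ja/chrome/demos/speech.html

Select Japanese and say "10キロ、20キロ、30キロ、…" one value after another.
Check the transcribed text: Arabic digits and kanji numerals are mixed together.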

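60 and 70 are especially unreliable and tend to come back as kanji:
六十キロ七十キロ
When several values are spoken in one go the recognizer seems to correct them, but a value spoken on its own is not corrected.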
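To check collected transcripts for the mix described above, a small helper along these lines could be used. This is only a sketch: `numeralStyles` and `hasMixedNumerals` are hypothetical names that do not appear in the original article, and the kanji check is deliberately naive (it also fires on ordinary words that contain characters such as 一 or 千).

```javascript
// Hypothetical helpers (assumption, not part of the original article).
// They only report which numeral styles appear in recognized transcripts.
const KANJI_NUMERALS = /[〇一二三四五六七八九十百千万]/
const ARABIC_DIGITS = /[0-9０-９]/

// Returns which numeral styles occur in a single transcript.
function numeralStyles(transcript) {
  return {
    arabic: ARABIC_DIGITS.test(transcript),
    kanji: KANJI_NUMERALS.test(transcript)
  }
}

// True when a list of transcripts contains both Arabic and kanji numerals.
function hasMixedNumerals(transcripts) {
  const styles = transcripts.map(numeralStyles)
  return styles.some((s) => s.arabic) && styles.some((s) => s.kanji)
}
```

Passing the logged results to `hasMixedNumerals` makes it easy to see whether a given setup still produces the mixed output.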

Solution

Set a SpeechGrammar on the recognizer.
- https://developer.mozilla.org/ja/docs/Web/API/SpeechGrammar
The grammar header needs to specify `JIS ja`.

      // Register the Japanese numbers as words
      const grammar =
        '#JSGF V1.0 JIS ja; grammar numbers; public <numbers> = 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 ;'
      const SpeechGrammarList =
        window.webkitSpeechGrammarList || window.SpeechGrammarList
      const speechRecognitionList = new SpeechGrammarList()
      speechRecognitionList.addFromString(grammar, 1)
      recognition.grammars = speechRecognitionList

With this in place, the received text comes back as Arabic digits.
(It worked for now, so I am calling it good enough.)
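If the list of values needs to change, the grammar string above could be generated rather than typed by hand. The following is a minimal sketch; `buildNumberGrammar` is a hypothetical helper name, and it only reproduces the same JSGF text as the snippet above.

```javascript
// Hypothetical helper (assumption, not from the original article):
// builds the same JSGF string as the hand-written one from a list of numbers.
function buildNumberGrammar(numbers) {
  if (!Array.isArray(numbers) || numbers.length === 0) {
    throw new Error('numbers must be a non-empty array')
  }
  const alternatives = numbers.map((n) => String(n)).join(' | ')
  return `#JSGF V1.0 JIS ja; grammar numbers; public <numbers> = ${alternatives} ;`
}

// Multiples of 10 from 10 to 100, matching the list used in the article.
const tens = Array.from({ length: 10 }, (_, i) => (i + 1) * 10)
const generatedGrammar = buildNumberGrammar(tens)
```

`generatedGrammar` can then be passed to `addFromString` in place of the literal string.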

Sample code for verification

Sample code for a Nuxt.js + Vuetify environment

recognition-test.vue

<template>
  <v-row justify="center">
    <v-col sm="12" md="11" lg="9" xl="6">
      <v-sheet class="pa-3">
        <h1 id="hoge">Speech recognition test</h1>
        <v-row align="center" justify="center" class="mt-2 mb-10 px-2">
          <v-btn
            outlined
            :disabled="disabled"
            color="iconcolor"
            rounded
            block
            @click="start"
            >{{ text }}</v-btn
          >
        </v-row>
        <v-sheet elevation="6">
          <v-list light>
            <v-list-item>
              <v-list-item-content
                >Note: recognized speech is displayed here</v-list-item-content
              >
            </v-list-item>
            <v-list-item v-for="(log, i) in logs" :key="i">
              <v-list-item-content>{{ log }}</v-list-item-content>
            </v-list-item>
          </v-list>
        </v-sheet>
      </v-sheet>
    </v-col>
  </v-row>
</template>

<script>
export default {
  data() {
    return {
      text: 'Start speech recognition',
      disabled: false,
      logs: []
    }
  },
  methods: {
    start() {
      const SpeechRecognition =
        window.webkitSpeechRecognition || window.SpeechRecognition
      const recognition = new SpeechRecognition()
      recognition.continuous = true
      // Set the language explicitly so recognition does not depend on the page default
      recognition.lang = 'ja-JP'

      // Register the Japanese numbers as words
      const grammar =
        '#JSGF V1.0 JIS ja; grammar numbers; public <numbers> = 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 ;'
      const SpeechGrammarList =
        window.webkitSpeechGrammarList || window.SpeechGrammarList
      const speechRecognitionList = new SpeechGrammarList()
      speechRecognitionList.addFromString(grammar, 1)
      recognition.grammars = speechRecognitionList

      recognition.onresult = (event) => {
        const inputMessage =
          event.results[event.results.length - 1][0].transcript
        this.logs.push(inputMessage)
      }
      recognition.onend = () => {
        // Recognition ends after a period with no input, so restart it to keep listening
        recognition.start()
      }

      recognition.start()

      this.text = 'Say something!'
      this.disabled = true
    }
  }
}
</script>
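
In case the grammar hint has no effect in a particular browser, the transcript could also be post-processed as a fallback. This is a different technique from the SpeechGrammar approach in the article: a plain string conversion of kanji numerals below 10000 into Arabic digits. `kanjiToNumber` and `normalizeNumerals` are hypothetical names, and the sketch assumes that every run of numeral kanji really is a number, so it will also rewrite ordinary words that contain such characters.

```javascript
// Hypothetical fallback (assumption, not from the original article):
// converts kanji numerals in a transcript to Arabic digits.
const KANJI_DIGITS = {
  '〇': 0, '一': 1, '二': 2, '三': 3, '四': 4,
  '五': 5, '六': 6, '七': 7, '八': 8, '九': 9
}
const KANJI_UNITS = { '十': 10, '百': 100, '千': 1000 }

// Converts one kanji numeral below 10000 (for example "六十") to a number.
function kanjiToNumber(kanji) {
  let total = 0
  let pending = 0 // digits waiting for a unit such as 十 or 百
  for (const ch of kanji) {
    if (ch in KANJI_DIGITS) {
      // Consecutive digits are read positionally, so "六〇" is also 60.
      pending = pending * 10 + KANJI_DIGITS[ch]
    } else {
      // A unit without a leading digit counts once ("十" is 10).
      total += (pending || 1) * KANJI_UNITS[ch]
      pending = 0
    }
  }
  return total + pending
}

// Replaces every run of numeral kanji in the transcript with digits.
function normalizeNumerals(transcript) {
  return transcript.replace(/[〇一二三四五六七八九十百千]+/g, (match) =>
    String(kanjiToNumber(match))
  )
}
```

In the sample above, this could be applied inside `onresult` before pushing to `logs`, for example `this.logs.push(normalizeNumerals(inputMessage))`.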
