Using SpeechRecognizer to respond to specific spoken words

Posted at 2019-07-21 (last updated at 2019-07-21)

What I want to do

Use Android's SpeechRecognizer so that, when you say a specific word to the device, the corresponding reply is shown in a Toast. 😳📲💬

Development environment

macOS Mojave 10.14.5
Android Studio 3.4.1

Main classes involved

SpeechRecognizer
RecognitionListener
RecognizerIntent

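####SpeechRecognizer
A class for accessing the device's speech recognition service.
Create an instance with the factory method SpeechRecognizer#createSpeechRecognizer(Context) that the class provides.
It also requires the Manifest.permission.RECORD_AUDIO permission.
Its methods must be called on the application's main thread!!
####RecognitionListener
A listener that receives callbacks such as the start and end of speech and the recognition results.
Its callbacks are also delivered on the application's main thread!!
####RecognizerIntent
A class that collects the constants used to build the Intent that starts speech recognition.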
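
Not every device ships with a recognition service, so it can be worth checking before creating the recognizer. The following is a minimal sketch, assuming a helper named `createRecognizerOrNull` (hypothetical, not part of the original sample); it only uses `SpeechRecognizer.isRecognitionAvailable(Context)` and `createSpeechRecognizer(Context)` from the platform API.

```kotlin
import android.content.Context
import android.speech.SpeechRecognizer

// Hypothetical helper (not in the original sample): returns a recognizer,
// or null when the device has no speech recognition service available.
fun createRecognizerOrNull(context: Context): SpeechRecognizer? =
    if (SpeechRecognizer.isRecognitionAvailable(context)) {
        // Call this on the main thread, like every other SpeechRecognizer method.
        SpeechRecognizer.createSpeechRecognizer(context)
    } else {
        null
    }
```

With a helper like this, the Activity could disable the start button when `null` comes back instead of failing later in `startListening`.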

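Implementation steps

1. Add the Internet and microphone permissions to AndroidManifest.xml (the recognition service is likely to stream audio to a remote server, so SpeechRecognizer needs Internet access).
2. Check the permission required for speech recognition.
3. Create a SpeechRecognizer instance with createSpeechRecognizer(Context) and register a RecognitionListener.
4. Create the Intent for SpeechRecognizer.
5. Pass that Intent to SpeechRecognizer's startListening(Intent) to start speech recognition.
6. Stop speech recognition with stopListening().
7. Check the value obtained from onResults and display the corresponding word.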

Below is the full code I implemented.

<!-- 1. Add the Internet and microphone permissions to AndroidManifest.xml (SpeechRecognizer uses the Internet). -->
<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.RECORD_AUDIO"/>


<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout
        xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        xmlns:app="http://schemas.android.com/apk/res-auto"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        tools:context=".MainActivity">

    <!-- Button that both starts and stops speech recognition (its label "スタート" means "Start") -->
    <Button
            android:id="@+id/startStopButton"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="スタート"
            app:layout_constraintBottom_toBottomOf="parent"
            app:layout_constraintLeft_toLeftOf="parent"
            app:layout_constraintRight_toRightOf="parent"
            app:layout_constraintTop_toTopOf="parent"/>

</androidx.constraintlayout.widget.ConstraintLayout>

class MainActivity : AppCompatActivity(), SimpleRecognizerListener.SimpleRecognizerResponseListener {

    private lateinit var speechRecognizer: SpeechRecognizer
    private lateinit var recognizerIntent: Intent

    // true → listening, false → stopped
    private var speechState = false

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        // 2. Check the permission required for speech recognition.
        PermissionChecker(this)

        // 3. Create a SpeechRecognizer instance with createSpeechRecognizer(Context) and register a RecognitionListener.
        setupSpeechRecognizer()

        // 4. Create the Intent for SpeechRecognizer.
        setupRecognizerIntent()

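        // The button id declared in activity_main.xml is startStopButton.
        startStopButton.setOnClickListener {
            if (speechState) {
                startStopButton.text = "スタート" // "Start"
                stopListening()
            } else {
                startStopButton.text = "ストップ" // "Stop"
                startListening()
            }
        }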
    }

    private fun setupSpeechRecognizer() {
        // Create the SpeechRecognizer instance
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
        speechRecognizer.setRecognitionListener(SimpleRecognizerListener(this))
    }

    private fun setupRecognizerIntent() {
        // Create the Intent used to request a response from the recognizer
        recognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
    }

    private fun startListening() {
        speechState = true
        // 5. Pass the Intent created above to SpeechRecognizer's startListening(Intent) to start speech recognition.
        speechRecognizer.startListening(recognizerIntent)
    }

    private fun stopListening() {
        speechState = false
        // 6. Stop speech recognition with stopListening().
        speechRecognizer.stopListening()
    }
    
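    // 7. Check the value obtained from onResults and display the corresponding word.
    override fun onResultsResponse(speechText: String) {
        // Change these strings to whatever you like.
        // "おはよう" + "ございます！" together form "おはようございます！" ("Good morning!").
        if (speechText == "おはよう") {
            Toast.makeText(this, "ございます！", Toast.LENGTH_SHORT).show()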
        } else {
            Toast.makeText(this, speechText, Toast.LENGTH_SHORT).show()
        }
    }
}

class SimpleRecognizerListener(private val listener: SimpleRecognizerResponseListener)
    : RecognitionListener {

    interface SimpleRecognizerResponseListener {
        fun onResultsResponse(speechText: String)
    }

    override fun onReadyForSpeech(p0: Bundle?) {

    }

    override fun onRmsChanged(p0: Float) {
    }

    override fun onBufferReceived(p0: ByteArray?) {
    }

    override fun onPartialResults(p0: Bundle?) {
    }

    override fun onEvent(p0: Int, p1: Bundle?) {
    }

    override fun onBeginningOfSpeech() {
    }

    override fun onEndOfSpeech() {
    }

    override fun onError(p0: Int) {
    }
    
    // 7. Check the value obtained from onResults and display the corresponding word.
    override fun onResults(bundle: Bundle?) {
        if (bundle == null) {
            listener.onResultsResponse("")
            return
        }

        val key = SpeechRecognizer.RESULTS_RECOGNITION
        val result = bundle.getStringArrayList(key)
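        // For some reason the result sometimes contained spaces, so any whitespace is stripped.
        // firstOrNull() avoids an IndexOutOfBoundsException when the candidate list is empty.
        val speechText = result?.firstOrNull()?.replace("\\s".toRegex(), "")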

        if (speechText.isNullOrEmpty()) {
            listener.onResultsResponse("")
        } else {
            listener.onResultsResponse(speechText)
        }
    }
}

####1. Add the Internet and microphone permissions to AndroidManifest.xml (SpeechRecognizer uses the Internet).

<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.RECORD_AUDIO"/>

These declare that the app may connect to the Internet and access the device's microphone!

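####2. Check the permission required for speech recognition.
I will skip a detailed explanation here, but since this is only a sample I implemented it very simply, as shown below.
I extracted it into its own class and wrote it inside init{}, but writing it directly in onCreate should work too.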

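    val recordAudioPermission = android.Manifest.permission.RECORD_AUDIO
    val currentPermissionState = ContextCompat.checkSelfPermission(context, recordAudioPermission)
    if (currentPermissionState != PackageManager.PERMISSION_GRANTED) {
        // Not granted yet: recognition cannot start until the user accepts.
        permissionState = false
        val activity = context as Activity
        if (ActivityCompat.shouldShowRequestPermissionRationale(activity, recordAudioPermission)) {
            // The user has denied this permission before; explain why the microphone
            // is needed before asking again (omitted in this sample).
        } else {
            // Show the system permission dialog. The answer arrives asynchronously in
            // Activity#onRequestPermissionsResult, so the permission is not granted yet here.
            ActivityCompat.requestPermissions(activity, arrayOf(recordAudioPermission), 1)
        }
    }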

For a detailed explanation, please refer to the official reference. 🙇‍♂️
Requesting Permissions at Run Time
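
The user's answer to the permission dialog comes back asynchronously through `onRequestPermissionsResult`. As a rough sketch of how that result could be recorded, assuming a hypothetical Activity named `PermissionAwareActivity`, a hypothetical constant `REQUEST_RECORD_AUDIO` equal to the request code 1 used in the sample, and a `permissionState` flag mirroring the one in the sample:

```kotlin
import android.content.pm.PackageManager
import androidx.appcompat.app.AppCompatActivity

// Hypothetical constant; it only has to match the code passed to requestPermissions.
private const val REQUEST_RECORD_AUDIO = 1

// Hypothetical Activity used only to illustrate the callback.
class PermissionAwareActivity : AppCompatActivity() {

    // Assumed flag, mirroring permissionState in the sample.
    private var permissionState = false

    override fun onRequestPermissionsResult(
        requestCode: Int,
        permissions: Array<out String>,
        grantResults: IntArray
    ) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults)
        if (requestCode != REQUEST_RECORD_AUDIO) return
        // grantResults is empty when the request was cancelled.
        permissionState = grantResults.isNotEmpty() &&
            grantResults[0] == PackageManager.PERMISSION_GRANTED
    }
}
```

Starting recognition only while `permissionState` is true would avoid calling `startListening` without microphone access.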

####3. Create a SpeechRecognizer instance with createSpeechRecognizer(Context) and register a RecognitionListener.

    // Create the SpeechRecognizer instance
    speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
    speechRecognizer.setRecognitionListener(SimpleRecognizerListener(this))

    class SimpleRecognizerListener(private val listener: SimpleRecognizerResponseListener)
        : RecognitionListener {
        // Custom listener for passing the value from onResults to the caller
        interface SimpleRecognizerResponseListener {
            fun onResultsResponse(speechText: String)
        }

        // (a lot omitted here...)

        // The words finally spoken to the device are delivered through onResults
        override fun onResults(bundle: Bundle?) {
            // Read the value from the bundle
        }
    }
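
The sample leaves `onError` empty, which makes failures silent. As one possible way to surface them, here is a small sketch that turns the `SpeechRecognizer.ERROR_*` codes into readable text. The function name `describeRecognizerError` and the message wording are my own assumptions; only the constants come from the platform API.

```kotlin
import android.speech.SpeechRecognizer

// Hypothetical helper: maps an onError(int) code to a short description.
fun describeRecognizerError(errorCode: Int): String = when (errorCode) {
    SpeechRecognizer.ERROR_AUDIO -> "audio recording error"
    SpeechRecognizer.ERROR_CLIENT -> "client side error"
    SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "RECORD_AUDIO permission is missing"
    SpeechRecognizer.ERROR_NETWORK,
    SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "network problem"
    SpeechRecognizer.ERROR_NO_MATCH -> "no recognition result matched"
    SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "recognizer is busy"
    SpeechRecognizer.ERROR_SERVER -> "server error"
    SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "no speech input"
    else -> "unknown error ($errorCode)"
}
```

Inside the listener, `onError(p0: Int)` could then log `describeRecognizerError(p0)` or forward it to the Activity in the same way `onResultsResponse` forwards results.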

####4. Create the Intent for SpeechRecognizer.

    // Create the Intent used to request a response from the recognizer
    recognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
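
RecognizerIntent defines further extras that can tune the request. The sketch below shows a few of them; the helper name `buildRecognizerIntent`, the language tag "ja-JP", and the result count 3 are illustrative assumptions, not values from the original sample.

```kotlin
import android.content.Intent
import android.speech.RecognizerIntent

// Hypothetical helper; the language tag and result count are illustrative values.
fun buildRecognizerIntent(packageName: String): Intent =
    Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
        // Ask for Japanese explicitly instead of relying on the device locale.
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ja-JP")
        // Upper bound on the number of candidate transcriptions returned.
        putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3)
        // Request onPartialResults callbacks while the user is still speaking.
        putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
    }
```

Whether each extra is honored depends on the recognition service installed on the device, so treat them as hints.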

####5. Pass the Intent created above to SpeechRecognizer's startListening(Intent).

    speechRecognizer.startListening(recognizerIntent)

This starts speech input!

####6. Stop speech recognition with stopListening().

    speechRecognizer.stopListening()

In this sample, tapping the button again after starting calls the stopListening() shown above!
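
Besides `stopListening()`, which stops capturing audio but still lets the speech heard so far be recognized, SpeechRecognizer also offers `cancel()` and `destroy()`. The following is a minimal sketch of tying them to the Activity lifecycle; the class name `RecognizerLifecycleActivity` is hypothetical and the choice of `onPause`/`onDestroy` is my assumption, not something the original sample does.

```kotlin
import android.os.Bundle
import android.speech.SpeechRecognizer
import androidx.appcompat.app.AppCompatActivity

// Hypothetical Activity used only to illustrate cleanup.
class RecognizerLifecycleActivity : AppCompatActivity() {

    private var speechRecognizer: SpeechRecognizer? = null

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
    }

    override fun onPause() {
        super.onPause()
        // cancel() aborts the current recognition without delivering results.
        speechRecognizer?.cancel()
    }

    override fun onDestroy() {
        // destroy() releases the connection to the recognition service.
        speechRecognizer?.destroy()
        speechRecognizer = null
        super.onDestroy()
    }
}
```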

####7. Check the value obtained from onResults and display the corresponding word.

    // onResults inside the RecognitionListener
    override fun onResults(bundle: Bundle?) {
        if (bundle == null) {
            listener.onResultsResponse("")
            return
        }
        
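        // Key for reading the recognized words out of the Bundle
        val key = SpeechRecognizer.RESULTS_RECOGNITION
        val result = bundle.getStringArrayList(key)
        // firstOrNull() avoids an IndexOutOfBoundsException when the candidate list is empty.
        val speechText = result?.firstOrNull()?.replace("\\s".toRegex(), "")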

        if (speechText.isNullOrEmpty()) {
            listener.onResultsResponse("")
        } else {
            // Pass the words to the custom listener
            listener.onResultsResponse(speechText)
        }
    }

    // Callback of the custom listener I implemented myself (overridden in MainActivity)
    override fun onResultsResponse(speechText: String) {
        if (speechText == "おはよう") {
            Toast.makeText(this, "ございます！", Toast.LENGTH_SHORT).show()
        } else {
            Toast.makeText(this, speechText, Toast.LENGTH_SHORT).show()
        }
    }
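
If more keywords are added, the if/else chain could be replaced with a lookup table. The following is a plain Kotlin sketch of that idea, runnable without Android. The names `responses`, `normalize`, and `respondTo` are hypothetical, and the second table entry is made up for illustration; only the first pair comes from the article.

```kotlin
// Hypothetical keyword table; the first entry matches the article, the second is invented.
val responses: Map<String, String> = mapOf(
    "おはよう" to "ございます！",
    "ありがとう" to "どういたしまして！"
)

// Removes all whitespace, mirroring the replace("\\s".toRegex(), "") call in onResults.
fun normalize(speechText: String): String = speechText.replace("\\s".toRegex(), "")

// Returns the registered reply, or echoes the recognized text when no keyword matches.
fun respondTo(speechText: String): String {
    val normalized = normalize(speechText)
    return responses[normalized] ?: normalized
}

fun main() {
    println(respondTo("おは よう"))   // keyword with a stray space
    println(respondTo("こんにちは")) // no keyword registered, so the input is echoed
}
```

In the Activity, `onResultsResponse` would then shrink to a single `Toast.makeText(this, respondTo(speechText), Toast.LENGTH_SHORT).show()` call.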

If you have followed along this far, a dialog like the one below should appear!

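Closing thoughts

This was my first time writing a fairly long article, and it took quite a while...
I felt that, with good use of SpeechRecognizer, features such as voice search should be easy to implement.
Next, I plan to write about the details of RecognitionListener.
Thank you for reading! ☺️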
