Building an AI Assistant in Python

Posted at 2022-12-23, last updated at 2022-12-23

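This article is the Day 23 entry of the Linkbal Advent Calendar 2022.

Introduction

The term "AI assistant" is no longer new. An AI assistant is an AI technology that recognizes spoken questions and requests and answers them. Well-known examples include Amazon Alexa on the Amazon Echo, Siri on the iPhone, and Google Assistant on Android. Seeing how they work, I wondered whether I could build something similar in a programming language. Based only on what I looked up myself, I built a simple AI assistant in Python, and this article introduces it.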

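Implementation

The implementation is split into three main parts:

Listening
Speech processing
Speaking

Listening

The speech_recognition Python library is used to recognize speech.
If you have not installed the speech_recognition library yet, run the following command.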

$ pip install speechrecognition

To use microphone input, you also need to install pyaudio.

$ pip install pyaudio

Once the setup above is complete, we can move on to the listening part. It uses the following code.

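import speech_recognition

# The Recognizer converts captured audio into text.
robot_ear = speech_recognition.Recognizer()
with speech_recognition.Microphone() as mic:
  print("ロボット: 聞いています。")  # "Robot: I'm listening."
  # Capture exactly 2 seconds of audio from the microphone.
  audio = robot_ear.record(mic, duration=2)
print("ロボット: ...")
try:
  you = robot_ear.recognize_google(audio, language='ja_JP')
except (speech_recognition.UnknownValueError, speech_recognition.RequestError):
  # Nothing intelligible was heard, or the API request failed.
  you = "..."
print("自分: " + you)  # "Me: <recognized text>"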

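Let's go through it in detail.
First, create a Recognizer instance and assign it to robot_ear. Then call speech_recognition.Microphone() and bind the result to a variable such as mic.
You could capture audio with robot_ear.listen(mic), which keeps listening until the phrase ends, but I wanted it to listen for only 2 seconds, so I used the record method with a duration of 2 seconds.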

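Speech processing

At this stage, the recognize_google method converts the captured audio to text. Because I want Japanese text, the language argument is set to ja_JP. The converted text is analyzed by the speech-processing part (the "brain" of the AI assistant), which produces an answer. That answer becomes the input to the speaking part.
It uses the following code.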

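import datetime

# `you` holds the text recognized in the listening part.
if "..." in you:
  robot_brain = "聞き取れないのでもう一度話してください。"  # "I couldn't hear you, please say it again."
elif "こんにちは" in you:  # "hello"
  robot_brain = "こんにちは、ズオン。"
elif "今日" in you:  # "today"
  robot_brain = datetime.datetime.now().strftime("%B %d, %Y")
elif "天気" in you:  # "weather"
  robot_brain = "寒くて、雪が降りそうです。"  # "It's cold and looks like snow."
elif "バイバイ" in you:  # "bye-bye"
  robot_brain = "また、ズオン。"
else:
  robot_brain = "はっきり発音してください。"  # "Please speak clearly."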
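
As the list of keywords grows, the if/elif chain above could be replaced by a small rule table. The following is only a sketch under my own assumptions: RULES, DEFAULT_REPLY, and respond are hypothetical names that do not appear in the original code. The date reply is also swapped for a directly formatted 年月日 string instead of strftime, so the result does not depend on the locale setting.

```python
import datetime

# Each rule pairs a keyword with a function that builds the reply.
# The keywords and replies are the ones used in the article.
RULES = [
    ("...", lambda now: "聞き取れないのでもう一度話してください。"),
    ("こんにちは", lambda now: "こんにちは、ズオン。"),
    ("今日", lambda now: f"{now.year}年{now.month}月{now.day}日"),
    ("天気", lambda now: "寒くて、雪が降りそうです。"),
    ("バイバイ", lambda now: "また、ズオン。"),
]
DEFAULT_REPLY = "はっきり発音してください。"


def respond(you, now=None):
    """Return the reply for the recognized text `you`.

    Rules are checked in order, so earlier keywords win, which matches
    the behaviour of an if/elif chain.
    """
    if now is None:
        now = datetime.datetime.now()
    for keyword, make_reply in RULES:
        if keyword in you:
            return make_reply(now)
    return DEFAULT_REPLY
```

With this layout, adding a new question means appending one entry to RULES rather than editing the control flow.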

Speaking

Finally, the speaking part uses the pyttsx3 Python library to convert text to speech.
If you have not installed the pyttsx3 library yet, run the following command.

$ pip install pyttsx3

The speaking part uses code like the following.

import pyttsx3

def change_voice(engine, language, gender='VoiceGenderFemale'):
  for voice in engine.getProperty('voices'):
    if language in voice.languages and gender == voice.gender:
      engine.setProperty('voice', voice.id)
      return True
  raise RuntimeError("Language '{}' for gender '{}' not found".format(language, gender))

robot_brain = "いい天気ですから散歩しましょう。"
engine = pyttsx3.init()
change_voice(engine, "ja_JP", "VoiceGenderFemale")
engine.say(robot_brain)
engine.runAndWait()

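Here is a little more detail.
First, pyttsx3.init() creates the engine that lets the AI speak, and the result is assigned to the engine variable. Next, because the default voice is English and I want it to speak Japanese, I use a custom function called change_voice. Finally, the text to speak is passed to the say method, and runAndWait blocks until the speech has finished.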

Full code

ai_bot.py

import speech_recognition
import pyttsx3
import datetime
import locale

locale.setlocale(locale.LC_TIME, 'ja_JP')

robot_ear = speech_recognition.Recognizer()
robot_mouth = pyttsx3.init()
robot_brain = ""

def change_voice(engine, language, gender='VoiceGenderFemale'):
  for voice in engine.getProperty('voices'):
    if language in voice.languages and gender == voice.gender:
      engine.setProperty('voice', voice.id)
      return True
  raise RuntimeError("Language '{}' for gender '{}' not found".format(language, gender))

while True:
  with speech_recognition.Microphone() as mic:
    print("ロボット: 聞いています。")
    audio = robot_ear.record(mic, duration = 2)

  print("ロボット: ...")
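  try:
    you = robot_ear.recognize_google(audio, language='ja_JP')
  except (speech_recognition.UnknownValueError, speech_recognition.RequestError):
    # Nothing intelligible was heard, or the API request failed.
    you = "..."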
  print("自分: " + you)

  if "..." in you:
    robot_brain = "聞き取れないのでもう一度話してください。"
  elif "こんにちは" in you:
    robot_brain = "こんにちは、ズオン。"
  elif "今日" in you:
    robot_brain = datetime.datetime.now().strftime("%B %d, %Y")
  elif "天気" in you:
    robot_brain = "寒くて、雪が降りそうです。"
  elif "バイバイ" in you:
    robot_brain = "また、ズオン。"
    print("ロボット: " + robot_brain)
    change_voice(robot_mouth, "ja_JP", "VoiceGenderFemale")
    robot_mouth.say(robot_brain)
    robot_mouth.runAndWait()
    break
  else:
    robot_brain = "はっきり発音してください。"

  print("ロボット: " + robot_brain)

  change_voice(robot_mouth, "ja_JP", "VoiceGenderFemale")
  robot_mouth.say(robot_brain)
  robot_mouth.runAndWait()

The full code uses while True and break so that the AI assistant keeps running until it hears the exit word.
Try it out and see the result.

$ python3 ai_bot.py
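
The while True and break structure can be tried without a microphone or speaker by passing the three parts in as functions. This is a minimal sketch under my own assumptions: run_assistant and its parameters are hypothetical names. Unlike the original loop, it exits whenever the exit word appears in the input, even if an earlier keyword also matched.

```python
def run_assistant(listen, respond, speak, exit_word="バイバイ"):
    """Run the listen -> respond -> speak cycle until the exit word is heard.

    listen()       returns the recognized text
    respond(text)  returns the reply text
    speak(reply)   says the reply aloud
    Returns the number of turns handled.
    """
    turns = 0
    while True:
        you = listen()
        reply = respond(you)
        speak(reply)  # speak in one place, including the farewell
        turns += 1
        if exit_word in you:
            break
    return turns
```

Because speaking happens in a single place, the farewell branch no longer needs its own copy of the say and runAndWait calls.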

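Conclusion

For now, the AI assistant built above can answer only simple questions, but I think it can be made smarter by enriching the inputs and outputs of the speech-processing part.
Thank you for reading this far.
Bye-bye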
