LINEDC Advent Calendar 2024

Quickly Trying Out a Speech-to-Text LINE Bot with Flask + Cloudflare Tunnel

Last updated at 2024-12-26 / Posted at 2024-12-25

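Transcribing voice messages with a LINE Bot and replying with the text

This article shows how to receive a voice message with a LINE Bot, transcribe the audio with OpenAI's Whisper model, and reply to the user with the result.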

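Running on Google Colab

You can try this script directly from Google Colab.

Open the following notebook:
LINE_Bot_Audio_Transcription.ipynb
Set the required keys and run the cells in order.
In the LINE Developers console, set the public URL issued by Cloudflare Tunnel as the Webhook URL, with the /callback path appended (that is the route the script registers).
Send a voice message to the LINE Bot and check the reply.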

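Overview

Using the LINE Messaging API and Flask, we implement the following:

The LINE Bot receives a voice message.
The audio file is saved on the server.
The audio is transcribed through the OpenAI API.
The transcription is sent back to the user as a reply.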
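The first step above comes down to picking audio message events out of the webhook payload. The following is a minimal sketch of that filtering, not part of the script in this article: the function name `extract_audio_events` and the sample payload are made up for illustration, and the payload only contains the fields the script itself reads (`events`, `type`, `message`, `replyToken`).

```python
from typing import Any, Dict, List, Tuple


def extract_audio_events(payload: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (message_id, reply_token) pairs for audio message events."""
    results = []
    for event in payload.get("events", []):
        message = event.get("message", {})
        # Only message events whose message type is "audio" are of interest
        if event.get("type") == "message" and message.get("type") == "audio":
            results.append((message.get("id"), event.get("replyToken")))
    return results


# Hypothetical payload, trimmed down to the fields used above
sample_payload = {
    "events": [
        {"type": "message", "replyToken": "token-1",
         "message": {"type": "audio", "id": "100"}},
        {"type": "message", "replyToken": "token-2",
         "message": {"type": "text", "id": "101", "text": "hi"}},
        {"type": "follow", "replyToken": "token-3"},
    ]
}

print(extract_audio_events(sample_payload))
```

Keeping the filtering in a small pure function like this makes it easy to check without starting Flask or calling any external API.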

Requirements

Python 3.7+
Flask
LINE Messaging API channel access token
OpenAI API key
Cloudflare Tunnel (for the development environment)

Full script

Below is the complete Python script used for the implementation.

import requests
import os
from flask import Flask, request, jsonify
import threading
import hmac
import hashlib
import base64
import openai

# LINE Bot channel secret and channel access token
CHANNEL_SECRET = "your_channel_secret"  # Channel secret from the LINE Developers console
CHANNEL_ACCESS_TOKEN = "your_channel_access_token"  # Channel access token

# OpenAI API key
OPENAI_API_KEY = "your_openai_api_key"  # Your OpenAI API key
openai.api_key = OPENAI_API_KEY

# Directory for downloaded files
SAVE_DIR = "downloads"
os.makedirs(SAVE_DIR, exist_ok=True)

# Initialize the Flask application
app = Flask(__name__)

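# Send a text reply through the LINE Messaging API
def reply_message(reply_token, message):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {CHANNEL_ACCESS_TOKEN}"
    }
    data = {
        "replyToken": reply_token,
        "messages": [
            {"type": "text", "text": message}
        ]
    }
    # A timeout keeps the webhook handler from hanging on a stalled request
    response = requests.post("https://api.line.me/v2/bot/message/reply", headers=headers, json=data, timeout=10)
    if response.status_code != 200:
        print(f"Reply failed: {response.status_code} {response.text}")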

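# Verify the X-Line-Signature header against the raw request body
def verify_signature(signature, body):
    if not signature:
        return False  # Header missing: reject instead of raising
    digest = hmac.new(CHANNEL_SECRET.encode('utf-8'), body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # Constant-time comparison of the two byte strings
    return hmac.compare_digest(expected, signature.encode('utf-8'))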

# Download message content (audio, image, etc.) from the LINE Messaging API
def download_content(message_id, file_name):
    headers = {
        "Authorization": f"Bearer {CHANNEL_ACCESS_TOKEN}"
    }
    url = f"https://api-data.line.me/v2/bot/message/{message_id}/content"
    response = requests.get(url, headers=headers, stream=True, timeout=30)

    if response.status_code == 200:
        file_path = os.path.join(SAVE_DIR, file_name)
        # Stream the body to disk in 1 KiB chunks
        with open(file_path, "wb") as file:
            for chunk in response.iter_content(1024):
                file.write(chunk)
        print(f"Saved: {file_path}")
        return file_path
    else:
        print(f"Failed to fetch content: {response.status_code}")
        return None

# Transcribe an audio file (uses the client interface of openai>=1.0;
# openai.Audio.transcribe was removed in that release)
def transcribe_audio_with_openai(file_path):
    try:
        client = openai.OpenAI(api_key=OPENAI_API_KEY)
        with open(file_path, "rb") as audio_file:
            response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
        text = response.text
        print(f"Transcription result: {text}")
        return text
    except Exception as e:
        print(f"OpenAI speech recognition failed: {e}")
        return None

@app.route('/callback', methods=['POST'])
def callback():
    # Read the signature header and the raw request body
    signature = request.headers.get('X-Line-Signature')
    body = request.get_data()

    if not verify_signature(signature, body):
        return jsonify({'status': 'error', 'message': 'Invalid signature'}), 403

    event = request.get_json(silent=True) or {}
    print("Received event:", event)

    # Handle message events
    for event_data in event.get('events', []):
        if event_data.get('type') == 'message':
            message_type = event_data.get('message', {}).get('type')
            message_id = event_data.get('message', {}).get('id')
            reply_token = event_data.get('replyToken')

            if message_type in ["image", "video", "file", "audio"]:  # includes audio
                # Use the provided file name, or generate a default one if none is given
                file_name = event_data.get('message', {}).get('fileName', f"{message_id}.{message_type}")
                # Drop any directory components so the file stays inside SAVE_DIR
                file_name = os.path.basename(file_name)

                # For audio, derive the extension from the content type
                if message_type == "audio":
                    content_type = event_data.get('message', {}).get('contentType', 'audio/m4a')
                    extension = content_type.split('/')[-1]  # Take the subtype of the MIME type
                    file_name = f"{message_id}.{extension}"  # e.g. audio/mpeg -> .mpeg

                print(f"Saving received {message_type}: {file_name}")
                file_path = download_content(message_id, file_name)

                if message_type == "audio" and file_path:
                    # Transcribe the audio file with OpenAI
                    transcription = transcribe_audio_with_openai(file_path)
                    if transcription:
                        reply_message(reply_token, transcription)
            elif message_type == "text":
                message = event_data.get('message', {}).get('text', '')
                print("Received text message:", message)

    return jsonify({'status': 'ok'}), 200

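# Function that starts the server
def run():
    app.run(host='0.0.0.0', port=5000)

# Run the Flask server in a separate thread
thread = threading.Thread(target=run)
thread.start()

# Start a Cloudflare Tunnel and obtain a URL for external access
# (the leading ! is notebook shell syntax, as used in Google Colab)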
!wget -N https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64
!chmod +x cloudflared-linux-amd64
!./cloudflared-linux-amd64 tunnel --url http://127.0.0.1:5000 &

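Code walkthrough

1. Installing the required libraries

First, install the required libraries with the following command.

pip install flask requests openai

2. Setting the channel secret and access token

Set the channel secret and channel access token obtained from the LINE Developers console.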

CHANNEL_SECRET = "your_channel_secret"
CHANNEL_ACCESS_TOKEN = "your_channel_access_token"
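The channel secret is what the script uses to check the `X-Line-Signature` header: the signature is the Base64-encoded HMAC-SHA256 digest of the raw request body, keyed with the channel secret. The sketch below shows that check in isolation with a dummy secret, which can be handy for signing test requests locally. The helper names `compute_signature` and `is_valid_signature` and all values are illustrative, not part of the script.

```python
import base64
import hashlib
import hmac
from typing import Optional


def compute_signature(channel_secret: str, body: bytes) -> str:
    """Base64-encoded HMAC-SHA256 of the raw body, keyed with the channel secret."""
    digest = hmac.new(channel_secret.encode("utf-8"), body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_signature(channel_secret: str, body: bytes, signature: Optional[str]) -> bool:
    """Compare the header value with the expected signature in constant time."""
    if not signature:
        return False  # a missing header is treated as invalid
    expected = compute_signature(channel_secret, body)
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))


# Dummy values for a local check (not real credentials)
secret = "dummy_secret"
body = b'{"events": []}'
good_signature = compute_signature(secret, body)

print(is_valid_signature(secret, body, good_signature))  # valid signature
print(is_valid_signature(secret, body, "tampered"))      # wrong signature
print(is_valid_signature(secret, body, None))            # header missing
```

The comparison is done on bytes with `hmac.compare_digest` so that a missing or non-ASCII header value is rejected rather than raising an exception.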

3. Setting the OpenAI API key

Set your OpenAI API key.

OPENAI_API_KEY = "your_openai_api_key"
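Hard-coding keys is fine for a quick trial, but it is easy to leak them when sharing a notebook. One alternative is to read them from environment variables. The following is a minimal sketch under that assumption; the helper `load_setting` and the variable names `LINE_CHANNEL_SECRET`, `LINE_CHANNEL_ACCESS_TOKEN`, and `OPENAI_API_KEY` are hypothetical choices, not something the script requires.

```python
import os
from typing import Mapping


def load_setting(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a required setting, failing early with a clear message if it is unset."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demonstration with a plain dict standing in for the real environment
demo_env = {"LINE_CHANNEL_SECRET": "dummy_secret", "OPENAI_API_KEY": " dummy_key "}
print(load_setting("LINE_CHANNEL_SECRET", demo_env))
print(load_setting("OPENAI_API_KEY", demo_env))

# In the script itself, this would look like (hypothetical variable names):
# CHANNEL_SECRET = load_setting("LINE_CHANNEL_SECRET")
# CHANNEL_ACCESS_TOKEN = load_setting("LINE_CHANNEL_ACCESS_TOKEN")
# OPENAI_API_KEY = load_setting("OPENAI_API_KEY")
```

Failing at startup when a key is missing is usually easier to debug than a 401 response from an API call later on.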

4. Starting the server

Start a local server with Flask.

def run():
    app.run(host='0.0.0.0', port=5000)

Use Cloudflare Tunnel to expose the local server to the internet.

wget -N https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64
chmod +x cloudflared-linux-amd64
./cloudflared-linux-amd64 tunnel --url http://127.0.0.1:5000 &

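5. Processing voice messages

The audio file is received through the LINE Messaging API and saved. It is then transcribed with the OpenAI Whisper API. With openai 1.0 or later (what pip install openai installs), the call goes through a client object, because the older openai.Audio.transcribe interface was removed.

def transcribe_audio_with_openai(file_path):
    client = openai.OpenAI(api_key=OPENAI_API_KEY)
    with open(file_path, "rb") as audio_file:
        response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    return response.text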
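The file extension matters here because the transcription endpoint infers the audio format from the file name. The script derives the extension from the MIME subtype, which can produce names such as `x-m4a`. The following is a sketch of a slightly more defensive mapping. The helper `audio_extension`, the MIME table, and the list of accepted extensions are assumptions for illustration; check the list against the current OpenAI documentation before relying on it.

```python
from typing import Optional

# Extensions assumed to be accepted by the transcription endpoint (verify against the docs)
WHISPER_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}

# A few MIME types whose subtype is not itself a usable extension (illustrative, not exhaustive)
MIME_TO_EXTENSION = {
    "audio/x-m4a": "m4a",
    "audio/mp4": "m4a",
    "audio/mpeg": "mp3",
    "audio/x-wav": "wav",
}


def audio_extension(content_type: Optional[str], default: str = "m4a") -> str:
    """Pick a file extension for a MIME type, falling back to a default."""
    if not content_type:
        return default
    # Drop parameters such as "; charset=binary" and normalize case
    mime = content_type.split(";")[0].strip().lower()
    if mime in MIME_TO_EXTENSION:
        return MIME_TO_EXTENSION[mime]
    subtype = mime.split("/")[-1]
    return subtype if subtype in WHISPER_EXTENSIONS else default


for value in ["audio/x-m4a", "audio/mpeg", "audio/wav", "audio/aac; charset=binary", None]:
    print(value, "->", audio_extension(value))
```

The default of `m4a` mirrors the fallback content type used in the script (`audio/m4a`).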

6. Replying with the message

The transcription is sent back to the user through the LINE Messaging API.

def reply_message(reply_token, message):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {CHANNEL_ACCESS_TOKEN}"
    }
    data = {
        "replyToken": reply_token,
        "messages": [
            {"type": "text", "text": message}
        ]
    }
    requests.post("https://api.line.me/v2/bot/message/reply", headers=headers, json=data)
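A long recording can produce a transcription that does not fit into a single text message. To my understanding, the Messaging API limits a text message to 5000 characters and a reply to 5 message objects, so the sketch below splits the text accordingly. The helper `build_text_messages` is hypothetical, and `len()` counts Unicode code points, which may differ from how LINE counts characters (for example with emoji), so treat the limit as approximate.

```python
from typing import Dict, List


def build_text_messages(text: str, limit: int = 5000, max_messages: int = 5) -> List[Dict[str, str]]:
    """Split text into LINE text message objects of at most `limit` characters each."""
    chunks = [text[i:i + limit] for i in range(0, len(text), limit)]
    # Anything beyond max_messages chunks is dropped, since one reply cannot carry more
    return [{"type": "text", "text": chunk} for chunk in chunks[:max_messages]]


messages = build_text_messages("a" * 12000)
print(len(messages))                        # number of message objects
print([len(m["text"]) for m in messages])   # length of each chunk
```

The returned list could be passed as the `messages` field of the reply payload instead of the single-element list used in `reply_message`. An empty transcription yields an empty list, in which case the reply should be skipped.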

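Result

When you send a voice message to the LINE Bot, it replies with the transcribed text, as shown below.

User: (voice message)
Bot: "わが輩は猫である。名前はまだない。どこで・・・" ("I am a cat. As yet I have no name. Where...")

Summary

With this approach, a LINE Bot can transcribe voice messages and respond to the user with the resulting text. If you are interested, please give it a try.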
