
Speech-to-text transcription with AWS Transcribe

Posted at 2021-12-12

Architecture

Uploading an audio file to the input bucket triggers a Lambda function, which starts a transcription job for that file.

Steps

1. Create the buckets (two: one for input, one for output)
2. Create the IAM role
3. Create transcribe_function

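Creating the S3 buckets

As the warning shown when configuring the Lambda trigger points out, the input and output buckets should be separate. Writing results to the bucket that triggers the function can cause the function to invoke itself recursively.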
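The two buckets can also be created from code instead of the console. The following is a minimal sketch, assuming boto3 and AWS credentials are available; the helper names bucket_request and create_buckets are hypothetical and not part of the original article.

```python
def bucket_request(name, region):
    """Build the keyword arguments for s3.create_bucket."""
    request = {"Bucket": name}
    # us-east-1 is the default location and must not be sent as a LocationConstraint
    if region != "us-east-1":
        request["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return request


def create_buckets(names, region="ap-northeast-1"):
    """Create each bucket. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so bucket_request can be tried without boto3

    s3 = boto3.client("s3", region_name=region)
    for name in names:
        s3.create_bucket(**bucket_request(name, region))
```

Calling create_buckets with one input and one output bucket name would create both. Bucket names are globally unique, so any names used here have to be your own.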

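Creating the IAM role

Choose "Use a blueprint" and create the function.

Grant full access to S3 and Transcribe.
When the required operations are known, the permissions should be narrowed down, but since this is a personal environment, full access is granted for now.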
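If you later want to narrow the permissions, a policy along the following lines may be a starting point. This is only a sketch: the action list is an assumption about what this particular function needs (reading the uploaded audio, writing the result, starting the job), the bucket names are placeholders, and build_policy is a hypothetical helper. Permissions for writing Lambda logs to CloudWatch are not included and would have to be attached separately.

```python
import json


def build_policy(input_bucket, output_bucket):
    """Return a narrowed-down policy document for the Lambda role (assumed action set)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the uploaded audio file
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            },
            {
                # Write the transcription result
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{output_bucket}/*",
            },
            {
                # Start the transcription job
                "Effect": "Allow",
                "Action": ["transcribe:StartTranscriptionJob"],
                "Resource": "*",
            },
        ],
    }


policy_json = json.dumps(
    build_policy("your-input-bucket-name", "your-output-bucket-name"), indent=2
)
print(policy_json)
```

The printed JSON can be pasted into the policy editor in the IAM console. If a job fails with an access error, the policy probably needs additional actions.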

Creating the Lambda function

For the existing role, select the role created earlier.
Specify the input bucket as the S3 trigger.
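The console sets up the trigger for you, but the same configuration can be expressed with boto3. The sketch below assumes the function already exists and that S3 has been allowed to invoke it (the console adds that permission automatically). The suffix filter and the helper names build_notification and attach_trigger are illustrative assumptions, not part of the original setup.

```python
def build_notification(function_arn, suffix=".mp3"):
    """Build an S3 notification configuration that invokes the function on upload."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Fire on every kind of object creation (PUT, POST, multipart upload, copy)
                "Events": ["s3:ObjectCreated:*"],
                # Only react to keys ending with the given suffix
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def attach_trigger(bucket, function_arn):
    """Apply the notification to the input bucket. Requires boto3 and credentials."""
    import boto3  # imported lazily so build_notification can be tried without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification(function_arn),
    )
```

Filtering by suffix is optional. It is shown here because it keeps non-audio uploads from starting transcription jobs.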

transcribe_function

import json
import urllib.parse
import boto3
import datetime

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))

    # Get the bucket name and object key of the uploaded file from the S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        transcribe.start_transcription_job(
            # Job names must be unique, so the name is built from the current timestamp
            TranscriptionJobName=datetime.datetime.now().strftime("%Y%m%d%H%M%S") + '_Transcription',
            LanguageCode='ja-JP',
            Media={
                # S3 URI of the uploaded audio file
                'MediaFileUri': 's3://' + bucket + '/' + key
            },
            # The parameter name is Settings (plural), and ShowSpeakerLabels is a boolean
            Settings={
                'ShowSpeakerLabels': True,
                'MaxSpeakerLabels': 2,
            },
            OutputBucketName='your-output-bucket-name'
        )
    except Exception as e:
        print(e)
        print('Error starting transcription job for object {} in bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

Getting the input bucket information

transcribe_function

import urllib.parse
import boto3
s3 = boto3.client('s3')

# Inside lambda_handler: get the bucket name and object key from the event sent by the input bucket trigger
bucket = event['Records'][0]['s3']['bucket']['name']
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
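To see why unquote_plus is needed, it helps to try the extraction on a hand-written event. Object keys in S3 event notifications arrive URL-encoded, so spaces show up as + and non-ASCII characters as percent-escapes. The sketch below uses a trimmed sample event containing only the fields the function reads; extract_bucket_and_key is a hypothetical helper name.

```python
import urllib.parse


def extract_bucket_and_key(event):
    """Return (bucket, decoded key) for the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Keys are URL-encoded in the event, so decode '+' and %XX escapes
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")
    return bucket, key


# Hand-written sample containing only the fields used above
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "your-input-bucket-name"},
                "object": {"key": "meeting+notes/%E4%BC%9A%E8%AD%B0.mp3"},
            }
        }
    ]
}

print(extract_bucket_and_key(sample_event))
```

Only the first record is read, which matches the handler in this article. An event can carry several records, so a more defensive handler would loop over event["Records"].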

Implementing the transcription call

transcribe_function

transcribe.start_transcription_job(
    TranscriptionJobName=datetime.datetime.now().strftime("%Y%m%d%H%M%S") + '_Transcription',
    LanguageCode='ja-JP',
    Media={
        # S3 URI of the uploaded audio file
        'MediaFileUri': 's3://' + bucket + '/' + key
    },
    # The parameter name is Settings (plural), and ShowSpeakerLabels is a boolean
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 2,
    },
    OutputBucketName='your-output-bucket-name'
)

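The transcription job name must be unique, so it is built from the current timestamp with datetime.
The language is set to Japanese (ja-JP).
The maximum number of speakers is specified with MaxSpeakerLabels.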
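A timestamp with one-second resolution can collide when two files are uploaded in the same second. One possible way around that, sketched below, is to add a short random suffix and a sanitized form of the object key. The helper make_job_name is hypothetical. It assumes job names may contain only letters, digits, periods, underscores, and hyphens, and may be at most 200 characters long.

```python
import datetime
import re
import uuid


def make_job_name(key, now=None):
    """Build a job name from a timestamp, a random suffix, and the sanitized key."""
    now = now or datetime.datetime.now()
    # Replace every character outside the allowed set with an underscore
    safe_key = re.sub(r"[^0-9a-zA-Z._-]", "_", key)
    # Eight random hex characters make collisions within one second unlikely
    suffix = uuid.uuid4().hex[:8]
    name = f"{now:%Y%m%d%H%M%S}_{suffix}_{safe_key}"
    # Truncate to the assumed maximum length
    return name[:200]


print(make_job_name("meeting notes/sample.mp3"))
```

Including the key in the name also makes it easier to tell which job belongs to which upload when looking at the job list.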

For details on the other parameters, see the official documentation.

Checking the contents of the output bucket

The transcribed text is written to the output bucket as a JSON file.
The JSON is not indented, so use a formatter to make it readable.
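Instead of an external formatter, the standard json module can re-indent the file and pull out just the text. The sketch below works on a hand-written, heavily trimmed sample; a real result file contains more fields (per-word items, speaker labels, and so on). The function names are hypothetical.

```python
import json


def pretty_print(raw_json):
    """Re-indent the result JSON, keeping Japanese text readable."""
    # ensure_ascii=False keeps non-ASCII characters instead of \uXXXX escapes
    return json.dumps(json.loads(raw_json), indent=2, ensure_ascii=False)


def extract_transcript(raw_json):
    """Return only the transcribed text from the result JSON."""
    result = json.loads(raw_json)
    return "\n".join(t["transcript"] for t in result["results"]["transcripts"])


# Hand-written sample with only a few of the fields found in a real result file
sample_output = (
    '{"jobName":"20211212093000_Transcription","status":"COMPLETED",'
    '"results":{"transcripts":[{"transcript":"こんにちは"}],"items":[]}}'
)

print(pretty_print(sample_output))
print(extract_transcript(sample_output))
```

To use this on a real result, download the JSON file from the output bucket and pass its contents to either function.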

Improving transcription accuracy

Specify the number of speakers
Register a custom vocabulary (proper nouns, technical terms)
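Both options above map to the Settings argument of start_transcription_job. The sketch below shows one way this might look: build_settings and register_vocabulary are hypothetical helpers, and the vocabulary name is a placeholder. A vocabulary also has to finish processing before a job can reference it, which this sketch does not wait for.

```python
def build_settings(max_speakers=2, vocabulary_name=None):
    """Build the Settings dict for start_transcription_job."""
    settings = {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers}
    # Only reference a custom vocabulary when one has been registered
    if vocabulary_name:
        settings["VocabularyName"] = vocabulary_name
    return settings


def register_vocabulary(name, phrases, language_code="ja-JP"):
    """Register proper nouns and technical terms. Requires boto3 and credentials."""
    import boto3  # imported lazily so build_settings can be tried without boto3

    transcribe = boto3.client("transcribe")
    transcribe.create_vocabulary(
        VocabularyName=name,
        LanguageCode=language_code,
        Phrases=phrases,
    )


print(build_settings(max_speakers=3, vocabulary_name="your-vocabulary-name"))
```

The returned dict can be passed as Settings=build_settings(...) in the handler shown earlier.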
