I want to do Speech-to-Text on Google Colab!

Posted at 2025-02-09

First, install the library:

!pip install --upgrade google-cloud-speech

And then:

speechToText.py

from google.cloud import speech_v1p1beta1 as speech
import time

# Authenticate with a service account key file
client = speech.SpeechClient.from_service_account_json('credentials.json')

# Replace this with the actual GCS URI of the bucket object that holds your audio file
gcs_uri = "gs://test/test.mp3"

audio = speech.RecognitionAudio(uri=gcs_uri)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.MP3,  # must match the audio file type
    language_code="ja-JP",  # recognize Japanese
    sample_rate_hertz=16000  # make sure this matches the file's sample rate
)

# Use asynchronous (long-running) recognition for large files
operation = client.long_running_recognize(config=config, audio=audio)
print("Waiting for operation to complete...")
while not operation.done():
    # Show progress if the metadata is available (otherwise just print "Processing...")
    metadata = operation.metadata
    if metadata is not None and hasattr(metadata, "progress_percent"):
        print("Progress: {}%".format(metadata.progress_percent))
    else:
        print("Processing...")
    time.sleep(10)
print("Recognition finished!")
response = operation.result()

# Print the transcription results
for result in response.results:
    print("Transcript: {}".format(result.alternatives[0].transcript))
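For long audio, the response usually comes back as several results, one per chunk of speech. If you want a single block of text instead of one printed line per chunk, something like the following might work. This is only a sketch: `join_transcripts` is a hypothetical helper name, not part of the library, and the `SimpleNamespace` objects just stand in for a real response so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def join_transcripts(response, separator="\n"):
    """Join the top alternative of every result into one string.

    Hypothetical helper: it only assumes `response.results` is iterable and
    each result has an `alternatives` list whose items have `.transcript`.
    """
    lines = []
    for result in response.results:
        if not result.alternatives:
            # Skip results that came back without any alternative
            continue
        lines.append(result.alternatives[0].transcript.strip())
    return separator.join(lines)


# Stand-in for a real response, used here only to demonstrate the helper
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript=" hello ")]),
    SimpleNamespace(alternatives=[]),
    SimpleNamespace(alternatives=[SimpleNamespace(transcript="this is a test")]),
])

full_text = join_transcripts(fake_response)
print(full_text)
```

With the script above you would call `join_transcripts(response)` after `operation.result()` returns.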

It worked.
Put the audio file in Cloud Storage beforehand.
You also seem to need a credentials file (the sample I looked at didn't use one...).
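If you would rather upload the audio from Colab itself instead of using the console, a rough sketch with the separate `google-cloud-storage` package could look like this. It assumes you have run `pip install google-cloud-storage` and that the service account in `credentials.json` is allowed to write to the bucket. `build_gcs_uri` and `upload_audio` are names made up for this example, and the bucket and object names are placeholders.

```python
def build_gcs_uri(bucket_name, blob_name):
    """Build the gs:// URI that RecognitionAudio(uri=...) expects."""
    return "gs://{}/{}".format(bucket_name, blob_name.lstrip("/"))


def upload_audio(local_path, bucket_name, blob_name,
                 credentials_path="credentials.json"):
    """Upload a local file to Cloud Storage and return its gs:// URI.

    Assumption: the google-cloud-storage package is installed. It is imported
    lazily so build_gcs_uri can still be used without it.
    """
    from google.cloud import storage

    client = storage.Client.from_service_account_json(credentials_path)
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return build_gcs_uri(bucket_name, blob_name)


# Example (placeholder names, needs real credentials and a real bucket):
# gcs_uri = upload_audio("test.mp3", "test", "test.mp3")
print(build_gcs_uri("test", "test.mp3"))
```

The returned URI can be passed straight to `speech.RecognitionAudio(uri=gcs_uri)`.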
