Automating transcription with Whisper and ChatGPT

Last updated at 2025-04-07, posted at 2025-04-07

Transcribing with Whisper and MeCab

Environment
OS: Windows 11
Editor: Visual Studio Code
Language: Python 3
Transcription: OpenAI Whisper

Downloads

Download ffmpeg, which is needed for audio processing.
ffmpeg: https://www.ffmpeg.org/download.html
I added the ffmpeg location as a new entry in the system environment variable Path.
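Whether the Path entry took effect can be checked from Python before going further. The following is a minimal sketch using only the standard library; the helper name find_ffmpeg is made up for this illustration.

```python
import shutil

def find_ffmpeg():
    """Return the full path of the ffmpeg executable found on PATH, or None."""
    # shutil.which searches the same PATH entries the terminal uses.
    return shutil.which("ffmpeg")

if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; try reopening the terminal after editing Path")
    else:
        print(f"ffmpeg found at: {path}")
```

If the result is None right after editing Path, the terminal (or VS Code) probably started before the change and needs to be restarted.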
Next, install the libraries.

pip install openai-whisper
pip install ffmpeg

Setting up a virtual environment

Set up a virtual environment.

# Create the virtual environment
python -m venv venv

Press Ctrl+Shift+P to open the Command Palette, run Python: Select Interpreter, and select
Python 3.XX ('venv': venv) .\venv\Scripts\python.exe.

Next, activate the virtual environment. Open a terminal and enter the commands below.

# Activate
.\venv\Scripts\activate
# Deactivate (run this when you are done)
deactivate
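To confirm that the terminal is really using the virtual environment, the interpreter itself can be asked. This is a small sketch built on the standard sys module; the function name in_virtual_environment is an assumption of this example.

```python
import sys

def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtual_environment())
```

After activation, the printed interpreter path should sit under .\venv\Scripts.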

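Now check that Whisper works.

test.py
import whisper
# Load test for Whisper
model = whisper.load_model("large")
print("Loaded the Whisper large model successfully")

If no error appears, the installation succeeded. The first run downloads the model weights, so it can take a while.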

Transcribing with Whisper

Now run the transcription with Whisper. I had ChatGPT generate the code.

import whisper
# Load the Whisper model
model = whisper.load_model("large")
# Path to the audio file
audio_file = ".\\venv\\audio.mp4"
# Run the transcription; the return value is a dict
result = model.transcribe(audio_file)
# Save the transcribed text, stored under the "text" key
with open("result.txt", "w", encoding="utf-8") as f:
    f.write(result["text"])
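The value returned by transcribe is a dict, and besides the full text it carries a "segments" list whose entries have "start", "end", and "text" keys. For a long recording, timestamps make the output easier to check against the audio. Below is a minimal sketch of one way to format them; the helper names and the sample data are hypothetical, and the sample stands in for real Whisper output so the block runs without the model.

```python
def format_timestamp(seconds):
    """Convert a number of seconds into an HH:MM:SS string."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

def format_segments(segments):
    """Render Whisper-style segments as one timestamped line each."""
    lines = []
    for segment in segments:
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start} - {end}] {segment['text'].strip()}")
    return "\n".join(lines)

# Hypothetical sample shaped like result["segments"]
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": " Hello and welcome."},
    {"start": 4.2, "end": 3725.9, "text": " Let's begin."},
]
print(format_segments(sample_segments))
```

In the script, writing format_segments(result["segments"]) instead of the plain text would save one timestamped line per segment.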

Since I chose the large model, I had expected it to take longer,
but a recording of about 2 hours was transcribed in 3 hours.
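Those two figures give a rough real-time factor (processing time divided by audio length), which helps when estimating other recordings. The sketch below uses the rounded numbers from this run; the 30-minute recording is a hypothetical example, and actual speed depends on the hardware and the audio.

```python
def real_time_factor(processing_hours, audio_hours):
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return processing_hours / audio_hours

# Rounded figures from this run: about 3 hours for about 2 hours of audio
rtf = real_time_factor(processing_hours=3.0, audio_hours=2.0)
print(f"real-time factor: {rtf:.2f}")

# Rough estimate for a hypothetical 30-minute recording at the same rate
estimate_hours = 0.5 * rtf
print(f"estimated time for 30 minutes of audio: {estimate_hours * 60:.0f} minutes")
```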

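I am wondering whether I could build a program that runs MeCab morphological analysis on the reference materials (text),
loads the result, and uses it to correct the transcribed text.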
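One possible shape for that idea is sketched below, purely as an assumption and not a finished design. It takes for granted that both the reference materials and the transcript have already been split into tokens (for example by MeCab, which is not shown here), and it uses the standard library's difflib as a stand-in for the matching step. The function name correct_tokens, the cutoff value, and the sample data are all hypothetical.

```python
import difflib

def correct_tokens(tokens, vocabulary, cutoff=0.75):
    """Replace each token with its closest vocabulary term, if one is close enough."""
    corrected = []
    for token in tokens:
        if token in vocabulary:
            # Already a known term: keep it unchanged.
            corrected.append(token)
            continue
        # get_close_matches returns at most n candidates scoring >= cutoff.
        matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected

# Hypothetical terms extracted from the reference materials
vocabulary = ["形態素解析", "文字起こし", "Whisper"]
# Hypothetical transcript tokens containing one misrecognized term
tokens = ["形態素回析", "で", "文字起こし", "を", "修正"]

print("".join(correct_tokens(tokens, vocabulary)))
```

Similarity on characters alone can also rewrite words that were correct, so a real version would probably restrict the vocabulary to nouns and technical terms and review each replacement.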

References

https://github.com/openai/whisper/discussions/696
