
How to Use FFmpeg from Python - Toward Using Julius -

Posted at 2022-03-13

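In this article, I want to use Julius, a large-vocabulary continuous speech recognition engine.
To use Julius, audio files must first be converted to the following format.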

Channels: 1 channel (mono)
Sample rate: 16000 Hz
File format: .wav file or headerless RAW file
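
The channel and sample-rate requirements above can be checked for an existing .wav file with Python's standard `wave` module. The following is only a minimal sketch: the function name `is_julius_ready` is hypothetical, and it checks just the two values listed above.

```python
import wave


def is_julius_ready(path, rate=16000, channels=1):
    """Return True if the WAV file at `path` has the given sample rate and channel count."""
    with wave.open(path, "rb") as wf:
        # getnchannels() is 1 for mono; getframerate() is the sample rate in Hz
        return wf.getnchannels() == channels and wf.getframerate() == rate
```

Calling `is_julius_ready("output.wav")` on a converted file should return True; a stereo or 44100 Hz file returns False.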

Here, I run FFmpeg from Python to convert audio into a format that Julius can use.

FFmpeg options

Sample rate

Set it as follows.

-ar [sample rate] # e.g. -ar 16000

Number of channels

Set it as follows.

-ac [number of channels] # e.g. -ac 1
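
When FFmpeg is called from Python, the two options above can be assembled into an argument list. The sketch below shows one way to do that; the helper name `build_ffmpeg_args` and its default values are assumptions chosen for illustration.

```python
def build_ffmpeg_args(input_file, output_file, sample_rate=16000, channels=1):
    """Build an FFmpeg argument list that sets the sample rate and channel count."""
    return [
        "ffmpeg",
        "-i", input_file,
        "-ar", str(sample_rate),  # sample rate in Hz
        "-ac", str(channels),     # number of audio channels
        output_file,
    ]
```

Every element must be a string, which is why the numeric values are passed through `str()`. The resulting list can be handed directly to `subprocess.run`.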

Running FFmpeg from the Command Prompt

First, run FFmpeg from the Command Prompt as follows to convert the file for Julius.

ffmpeg -i [input_file] -ar 16000 -ac 1 output.wav

Check the result of the conversion as follows.

ffprobe -i output.wav

If the output looks like the following, the conversion worked.

Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels
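
The same check can be automated from Python by asking ffprobe for JSON output instead of reading the log line by eye. This is a rough sketch, assuming ffprobe is on PATH; the function names `parse_audio_stream` and `probe_audio` are hypothetical.

```python
import json
import subprocess


def parse_audio_stream(ffprobe_json):
    """Extract (codec, sample_rate, channels) of the first listed stream from ffprobe JSON."""
    stream = json.loads(ffprobe_json)["streams"][0]
    return stream["codec_name"], int(stream["sample_rate"]), int(stream["channels"])


def probe_audio(path):
    """Run ffprobe on `path` and return the parsed values of its first audio stream."""
    args = [
        "ffprobe", "-v", "error",            # suppress the banner and info log
        "-select_streams", "a:0",            # first audio stream only
        "-show_entries", "stream=codec_name,sample_rate,channels",
        "-of", "json",                       # machine-readable output on stdout
        path,
    ]
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return parse_audio_stream(result.stdout)
```

For a file converted with the command above, `probe_audio("output.wav")` should return `("pcm_s16le", 16000, 1)`, matching the Stream line shown by ffprobe.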

Running FFmpeg from Python

The source code is shown below.

ffmpeg.py

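import subprocess

input_file = "input.mp4"    # source media file
output_file = "output.wav"  # converted file for Julius

# Same options as the Command Prompt example: 16000 Hz, 1 channel (mono)
args = ["ffmpeg", "-i", input_file, "-ar", "16000", "-ac", "1", output_file]

# Run FFmpeg and wait for it to finish
subprocess.run(args, stdout=subprocess.PIPE)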
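
The script above does not report a failed conversion, and FFmpeg asks for confirmation when the output file already exists. A slightly more defensive variant might look like the following sketch. The function name `convert_for_julius` is hypothetical; adding `-y` (overwrite without asking) and capturing stderr (where FFmpeg writes its log) are choices made here, not part of the original script.

```python
import shutil
import subprocess


def convert_for_julius(input_file, output_file):
    """Convert input_file to 16000 Hz mono and return FFmpeg's log text."""
    if shutil.which("ffmpeg") is None:
        raise FileNotFoundError("ffmpeg was not found on PATH")
    # -y overwrites an existing output file instead of prompting
    args = ["ffmpeg", "-y", "-i", input_file, "-ar", "16000", "-ac", "1", output_file]
    result = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if result.returncode != 0:
        # FFmpeg writes its diagnostics to stderr, so include them in the error
        raise RuntimeError(f"ffmpeg failed:\n{result.stderr}")
    return result.stderr
```

A call such as `convert_for_julius("input.mp4", "output.wav")` raises an exception on failure, so the calling script does not silently continue with a missing or broken output file.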

If you get the same result as when running from the Command Prompt, it worked.
That is how to use FFmpeg from Python in a Windows environment.

