A Script That Reads the m3u8 Files in a Folder and Downloads Them as mp4

Posted at 2024-04-24

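This article explains how to write a Python script that finds the .m3u8 files in a given folder and downloads the video at each URL they contain as an .mp4 file. It also covers how to handle files whose text encodings differ.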

Technologies Used

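Python 3: the scripting language.
subprocess: runs the external process (ffmpeg).
concurrent.futures: a Python standard-library module, used here to run downloads concurrently in a thread pool.
re: regular expressions, used to find URLs inside each file.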

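Source Code Walkthrough

Download function: `download_m3u8`

This function runs the ffmpeg command to download the stream at the given .m3u8 URL and save it directly as an .mp4 file. ffmpeg is a powerful media-processing tool; here it receives the streaming data and writes it out as a single video file. Because the command uses -c copy, the streams are remuxed into the MP4 container rather than re-encoded.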

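import os
import subprocess


def download_m3u8(url, folder_path, filename):
    # Output name: "<folder name> <source file name>" with ".m3u8" swapped for ".mp4".
    output_filename = f"{os.path.basename(folder_path)} {filename}".replace(".m3u8", ".mp4")
    output_filepath = os.path.join(folder_path, output_filename)
    # '-c copy' copies the streams without re-encoding; 'aac_adtstoasc' converts ADTS AAC audio for MP4.
    command = ['ffmpeg', '-i', url, '-c', 'copy', '-bsf:a', 'aac_adtstoasc', output_filepath]
    try:
        subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        print(f"Download completed: {output_filepath}")
    except subprocess.CalledProcessError as e:
        # ffmpeg writes its diagnostics to stderr, which was captured above.
        print(f"Error during download of {url}: {e.stderr.decode(errors='replace')}")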

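subprocess.run() runs the external process synchronously and waits for it to finish.
Because check=True is set, a non-zero exit status from ffmpeg raises subprocess.CalledProcessError; the function catches it and prints the error message that ffmpeg wrote to stderr.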
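
The error-handling pattern described above can be tried without ffmpeg installed. The following is a minimal sketch, not part of the original script: `run_and_report` is a hypothetical helper, and a short Python child process stands in for ffmpeg. It also handles `FileNotFoundError`, which `subprocess.run()` raises when the executable itself cannot be found; the original function does not catch that case.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return (ok, message) instead of raising."""
    try:
        subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
        return True, "ok"
    except subprocess.CalledProcessError as e:
        # Non-zero exit status: report what the child wrote to stderr.
        return False, e.stderr.decode(errors="replace").strip()
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return False, f"executable not found: {command[0]}"


# Stand-in for a failing ffmpeg run: writes to stderr and exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
print(run_and_report(failing))

# Stand-in for a missing ffmpeg binary (hypothetical program name).
print(run_and_report(["definitely-not-a-real-program-xyz"]))
```

Returning a status instead of only printing makes it easier for the caller to count failed downloads, which may be useful when many run in parallel.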

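Encoding-aware read function: `read_file_with_fallback`

This function reads files that may be stored in different encodings. It tries each encoding in the given list in order; the script passes utf-8 first and falls back to shift-jis if that fails.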

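def read_file_with_fallback(file_path, encodings):
    # Try each encoding in order and return the text from the first one that decodes cleanly.
    for encoding in encodings:
        try:
            with open(file_path, 'r', encoding=encoding) as file:
                return file.read()
        except UnicodeDecodeError:
            continue
    # No encoding worked: warn and let the caller skip this file.
    print(f"Warning: Failed to decode {file_path} with any of the provided encodings: {encodings}")
    return None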

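Trying several encodings lets the script handle files from a variety of sources.
If a read succeeds, the function returns the file's contents; if every encoding fails, it prints a warning and returns None.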
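
To see the fallback in action, the sketch below writes a Shift-JIS encoded file to a temporary folder and reads it back. It is an illustration under assumptions, not the article's code: `read_with_fallback` is a hypothetical variant that also returns the encoding that worked, and the file name and URL are made up.

```python
import os
import tempfile


def read_with_fallback(file_path, encodings):
    """Return (text, encoding) for the first encoding that decodes the file, else (None, None)."""
    for encoding in encodings:
        try:
            with open(file_path, "r", encoding=encoding) as file:
                return file.read(), encoding
        except UnicodeDecodeError:
            continue
    return None, None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.m3u8")
    # The Japanese text below encodes to bytes that are not valid UTF-8.
    with open(path, "wb") as f:
        f.write("テスト https://example.com/video.m3u8\n".encode("shift_jis"))
    text, used = read_with_fallback(path, ["utf-8", "shift_jis"])

# Show which encoding succeeded and how many characters were read.
print(used, len(text))
```

The order of the list matters: the first encoding that decodes without error wins, even if it is not the one the file was written in.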

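Main processing function: `download_m3u8_to_mp4`

This function scans the files in the given directory, extracts the .m3u8 URLs from them, and runs the downloads. It uses multithreading so that several downloads proceed in parallel.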

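import os
import re
from concurrent.futures import ThreadPoolExecutor


def download_m3u8_to_mp4(folder_path):
    futures = []
    # Snapshot the listing once so that files created by the downloads are not rescanned.
    initial_listdir = os.listdir(folder_path)
    # max_workers must be at least 1, so guard against an empty folder.
    with ThreadPoolExecutor(max_workers=max(1, len(initial_listdir))) as executor:
        for filename in initial_listdir:
            file_path = os.path.join(folder_path, filename)
            if os.path.isfile(file_path):
                content = read_file_with_fallback(file_path, ['utf-8', 'shift-jis'])
                if content:
                    # Collect every http(s) URL that ends in ".m3u8".
                    urls = re.findall(r'https?://\S+\.m3u8', content)
                    for url in urls:
                        future = executor.submit(download_m3u8, url, folder_path, filename)
                        futures.append(future)
        # Wait for every download to finish before reporting completion.
        for future in futures:
            future.result()
    print("All downloads are complete.")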

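The function extracts the URLs from each file and submits one task per URL to a ThreadPoolExecutor, so the downloads run concurrently.
The regular expression passed to re.findall() picks the .m3u8 URLs out of the file contents.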
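
The extraction and submission steps can be tried on sample text without touching the network. This is a minimal sketch under assumptions: the sample content and URLs are invented, `fake_download` is a hypothetical stand-in for `download_m3u8`, and the cap of 4 workers is an arbitrary choice rather than anything the original script does.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Same pattern as in the script: an http(s) URL ending in ".m3u8".
M3U8_URL = re.compile(r"https?://\S+\.m3u8")

sample = (
    "#EXTM3U\n"
    "https://example.com/live/stream.m3u8\n"
    "mirror: http://cdn.example.org/a/index.m3u8?token=abc\n"
    "playlist.m3u8 (no scheme, so not matched)\n"
)

urls = M3U8_URL.findall(sample)


def fake_download(url):
    """Stand-in for download_m3u8: returns a label instead of calling ffmpeg."""
    return f"done: {url}"


# Bound the pool size; max_workers must be at least 1.
with ThreadPoolExecutor(max_workers=max(1, min(4, len(urls)))) as executor:
    futures = [executor.submit(fake_download, url) for url in urls]
    # result() blocks until the task finishes and re-raises any exception from it.
    results = [future.result() for future in futures]

print(urls)
print(results)
```

Note that the pattern stops at the last ".m3u8" it can match, so the `?token=abc` query string in the second sample URL is not captured. If the playlists in use rely on query parameters, the pattern would need to be adjusted.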

Usage

Run the script from the command line as follows.

python script.py /path/to/folder

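Running the script scans every file in the specified directory for .m3u8 URLs and downloads the video from each URL it finds. Because it handles both UTF-8 and Shift-JIS, it can work with files from a variety of sources.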
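
The code shown in this article does not include the part that reads the folder path from the command line. One possible way to do it is sketched below; `parse_folder_argument` is a hypothetical helper built on the standard argparse module, and the argument name `folder_path` is an assumption.

```python
import argparse
import os


def parse_folder_argument(argv):
    """Return the folder path given on the command line, or exit with a usage error."""
    parser = argparse.ArgumentParser(
        description="Download the .m3u8 URLs found in a folder as .mp4 files."
    )
    parser.add_argument("folder_path", help="folder containing the playlist files")
    args = parser.parse_args(argv)
    # Fail early with a clear message if the path is not an existing directory.
    if not os.path.isdir(args.folder_path):
        parser.error(f"not a directory: {args.folder_path}")
    return args.folder_path


# Example call with an explicit argument list; "." is the current directory.
print(parse_folder_argument(["."]))
```

In the full script, the result of `parse_folder_argument(sys.argv[1:])` would be passed to `download_m3u8_to_mp4` inside an `if __name__ == "__main__":` guard.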

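Summary

With this script you can automate the process of reading the .m3u8 files in a folder and downloading their streams directly as .mp4 files. Handling different text encodings flexibly also matters, and the script addresses both needs.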
