
Python video I/O benchmark: OpenCV, PyAV, FFStream

Posted at 2024-02-05

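Introduction

This article compares the processing speed of FFStream with that of OpenCV and PyAV.
Results vary with the PC configuration, so I have prepared Python scripts that you can use to run the test yourself if you are interested.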

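Test procedure

Each video is processed as follows, and the time taken to process all frames is measured.

Input: read the video (decode)
Transfer: copy the frame to video memory
Image processing: set the blue channel to 0 in video memory using CUDA
Transfer: copy the frame back to main memory
Output: write the video (encode)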
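
The measurement described above can be sketched as a small timing harness. This is a minimal sketch, not the code used for the published numbers: `measure` and `fake_frames` are hypothetical names, and the dummy byte buffers stand in for decoded frames so that the example runs without a video file or a GPU.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure(frames: Iterable[bytes],
            process: Callable[[bytes], bytes]) -> Tuple[int, float, float]:
    """Run `process` on every frame and return (frame count, seconds, fps)."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process(frame)  # in a real test: transfer, edit, transfer back, encode
        count += 1
    elapsed = time.perf_counter() - start
    fps = count / elapsed if elapsed > 0 else float('inf')
    return count, elapsed, fps


def fake_frames(n: int, size: int = 12) -> Iterator[bytes]:
    """Hypothetical stand-in for a decoder: yields n small zero-filled buffers."""
    for _ in range(n):
        yield bytes(size)


if __name__ == '__main__':
    count, elapsed, fps = measure(fake_frames(1000), lambda f: f[::-1])
    print(f'{count} frames in {elapsed:.3f} s ({fps:.1f} fps)')
```

Timing the whole loop rather than individual steps matches the procedure above, where decode, transfer, processing and encode are measured together.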

Test video
Full HD 1920x1080 29.97fps approx. 5 min 40 s 10193 frames avc1:Main@L4.2 132MB 3Mbps

Test environment

PC1

Item	Details
OS	Microsoft Windows 11 Home 64-bit
CPU	Intel Core i7 13700K 5.3GHz 16cores 24threads
GPU	NVIDIA GeForce RTX 4090

PC2

Item	Details
OS	Microsoft Windows 10 Home 64-bit
CPU	Intel Core i7 7700K 4.2GHz 4cores 8threads
GPU	NVIDIA GeForce GTX 1050 Ti

Software

Python v3.11.4
OpenCV
- opencv-python==4.9.0.80
- openh264-1.8.0-win64.dll (download page)
PyAV
- av==10.0.0
FFStream
- FFStream v1.3.0 (download page)
- ffmpeg-python==0.2.0
- ffmpeg
  ffmpeg-2020-12-20-git-ab6a56773f-full_build.zip (download page)

OpenCV

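I tested 'mp4v', which appears in many OpenCV examples, and 'avc1:base', a common codec (using openh264-1.8.0-win64.dll).
Specifying avc1 when the DLL cannot be found is apparently supposed to fail, but in my environment some other library was used and an avc1 video was still created.
Task Manager showed the GPU encoder load rising during that run, so nvenc was presumably being used. Many details remain unclear, so I list that result as a reference value only.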

When 'avc1' is specified in OpenCV and the DLL cannot be found

codec = cv2.VideoWriter_fourcc(*'avc1')
writer = cv2.VideoWriter(args.output_path, codec, fps, (w, h),1)

If 'openh264-1.8.0-win64.dll' cannot be found at run time, the following message is displayed:

Failed to load OpenH264 library: openh264-1.8.0-win64.dll
Please check environment and/or download library: https://github.com/cisco/openh264/releases

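Processing nevertheless continues.

The GPU encoder load rises during this run, which suggests that nvenc is doing the encoding.

When processing finishes, a huge video file of about 60 Mbps has been created.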
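
One way to check what such a run actually produced is to inspect the output file with ffprobe, which ships with FFmpeg. The sketch below is an illustration under assumptions: `ffprobe_cmd`, `parse_stream_info` and `probe` are hypothetical helper names, ffprobe is assumed to be on PATH, and the JSON string in the demo is made-up sample data, not a measurement from this article.

```python
import json
import subprocess
from typing import List, Optional


def ffprobe_cmd(path: str) -> List[str]:
    """Build an ffprobe command that reports the first video stream as JSON."""
    return ['ffprobe', '-v', 'error', '-select_streams', 'v:0',
            '-show_entries', 'stream=codec_name,profile,bit_rate',
            '-of', 'json', path]


def parse_stream_info(ffprobe_json: str) -> Optional[dict]:
    """Extract codec, profile and bit rate (Mbps) from ffprobe's JSON output."""
    streams = json.loads(ffprobe_json).get('streams', [])
    if not streams:
        return None
    stream = streams[0]
    bit_rate = str(stream.get('bit_rate', ''))
    return {
        'codec': stream.get('codec_name'),
        'profile': stream.get('profile'),
        # ffprobe reports bit_rate as a string of digits when it is known
        'mbps': int(bit_rate) / 1_000_000 if bit_rate.isdigit() else None,
    }


def probe(path: str) -> Optional[dict]:
    """Run ffprobe on a file (requires ffprobe on PATH)."""
    result = subprocess.run(ffprobe_cmd(path), capture_output=True,
                            text=True, check=True)
    return parse_stream_info(result.stdout)


if __name__ == '__main__':
    # Made-up sample output, used so the demo runs without ffprobe or a video
    sample = ('{"streams": [{"codec_name": "h264", '
              '"profile": "High", "bit_rate": "58000000"}]}')
    print(parse_stream_info(sample))
```

Calling `probe('output_cv_avc1_nvenc.mp4')` after a run would show the codec, profile and bit rate of the file, but not which encoder library produced it.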

PyAV

The script specifies the same codec as the input video.
In this test the output is 'avc1:high'.

FFStream

FFStream uses 'avc1:high' by default, but I also tested 'mp4v' and nvenc for comparison with OpenCV.

Results

PC1 : Core i7 13700K 5.3GHz 16c 24t / RTX4090

Library	Encoder	File size	Bit rate	Processing time	fps
OpenCV	mp4v	641 MB	16 Mbps	2:55	58.14 fps
OpenCV	avc1:base	931 MB	23 Mbps	3:34	47.56 fps
PyAV	avc1:high	395 MB	10 Mbps	4:39	36.47 fps
FFStream	avc1:high	315 MB	8 Mbps	1:21	125.54 fps
FFStream	mp4v	406 MB	10 Mbps	0:44	226.96 fps
OpenCV	avc1:nvenc	2360 MB	58 Mbps	1:26	117.93 fps
FFStream	avc1:nvenc	243 MB	6 Mbps	0:42	241.35 fps

PC2 : Core i7 7700K 4.2GHz 4c 8t / GTX1050ti

Library	Encoder	File size	Bit rate	Processing time	fps
OpenCV	mp4v	641 MB	16 Mbps	3:25	49.65 fps
OpenCV	avc1:base	930 MB	23 Mbps	4:24	38.47 fps
PyAV	avc1:high	395 MB	10 Mbps	8:13	20.64 fps
FFStream	avc1:high	316 MB	8 Mbps	4:05	41.49 fps
FFStream	mp4v	405 MB	10 Mbps	1:49	93.36 fps
OpenCV	avc1:nvenc	2314 MB	57 Mbps	1:53	90.14 fps
FFStream	avc1:nvenc	243 MB	6 Mbps	1:49	92.69 fps

avc1:base=Constrained Baseline@L4.1
avc1:high=High@L4
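
The fps column can be approximated from the frame count and the processing time. This is a minimal sketch, assuming the 10193-frame count of the test video and the m:ss times printed in the PC1 table; `to_seconds` and `fps` are hypothetical helper names. Because the table times are rounded to whole seconds, the values computed here differ slightly from the published fps figures.

```python
TOTAL_FRAMES = 10193  # frame count of the test video


def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' string such as '2:55' to seconds."""
    minutes, seconds = mmss.split(':')
    return int(minutes) * 60 + int(seconds)


def fps(mmss: str, frames: int = TOTAL_FRAMES) -> float:
    """Average frames per second for a run that took `mmss`."""
    return frames / to_seconds(mmss)


if __name__ == '__main__':
    # Processing times copied from the PC1 table
    pc1 = {
        'OpenCV mp4v': '2:55',
        'OpenCV avc1:base': '3:34',
        'PyAV avc1:high': '4:39',
        'FFStream avc1:high': '1:21',
        'FFStream mp4v': '0:44',
        'OpenCV avc1:nvenc': '1:26',
        'FFStream avc1:nvenc': '0:42',
    }
    for name, elapsed in pc1.items():
        print(f'{name:20s} {elapsed:>5s} {fps(elapsed):7.2f} fps')
```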

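Discussion

For avc1 software encoding, OpenCV and FFStream perform about the same on 4 cores / 8 threads, and FFStream tends to pull ahead as the core count increases.
OpenCV seems to be affected little by the core count and to depend mainly on clock speed.
PyAV was slower than both in this test.

FFStream with mp4v came close to nvenc in speed. It has drawbacks, such as not being playable in web browsers, but it may be a viable option.

As for file size, the video is Full HD, 30 fps, 340 seconds, so I consider about 243 MByte, equivalent to 6 Mbps, a reasonable target. OpenCV, which does not seem to allow the bit rate or quality to be set, produces files that are too large to be practical.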
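
The 243 MByte figure follows from bit rate multiplied by duration. Here is a minimal sketch of the arithmetic, assuming that "MByte" is meant as MiB (1024 x 1024 bytes); the text does not say so, but that reading reproduces the number. `size_bytes` is a hypothetical helper name.

```python
def size_bytes(mbps: float, seconds: float) -> float:
    """Stream size in bytes for a constant bit rate in Mbps (10^6 bits/s)."""
    return mbps * 1_000_000 * seconds / 8


if __name__ == '__main__':
    b = size_bytes(6, 340)  # 6 Mbps for 340 seconds
    print(f'{b / 1_000_000:.0f} MB (decimal), {b / 2**20:.0f} MiB (binary)')
```

The same function applied to the roughly 58 Mbps of the OpenCV nvenc output gives a file about ten times larger, which is consistent with the sizes in the tables.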

Test scripts

OpenCV

test_cv.py

import torch
from tqdm import tqdm
import cv2
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('input_path'          , type=str  , help='input video file(full path)')
parser.add_argument('output_path'         , type=str  , help='output video file(full path)')
parser.add_argument('-e', '--encoder'     , type=str  , default='mp4v', choices=['mp4v', 'avc1'], help='encoder fourcc')
parser.add_argument('-c', '--cpu'         , action='store_true', help='use cpu')
args = parser.parse_args()

device = torch.device('cpu' if args.cpu else 'cuda')
_COLOR = '\033[92m'
_END   = '\033[0m'
print(f'device={device}, {_COLOR}{args.output_path}{_END}')

cap    = cv2.VideoCapture(args.input_path)
w      = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h      = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps    = cap.get(cv2.CAP_PROP_FPS)

#codec = cv2.VideoWriter_fourcc(*'mp4v')
#codec = cv2.VideoWriter_fourcc(*'avc1')
codec  = cv2.VideoWriter_fourcc(*args.encoder)
writer = cv2.VideoWriter(args.output_path, codec, fps, (w, h),1)

bar = tqdm(total=frames, dynamic_ncols=True)
while True:
    ret, img = cap.read()
    if not ret:
        break

    # Transfer the received frame to video memory and process it
    #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)         # convert BGR to RGB order
    frame_t = torch.frombuffer(img, dtype=torch.uint8)
    frame_rgb = frame_t.to(device)
    frame_rgb = frame_rgb.reshape(h,w,3)
    frame_rgb = frame_rgb.permute(2, 0, 1)              # reshape to [C,H,W] BBBBGGGGRRRR
    frame_rgb = frame_rgb.float().div(255)              # convert to floats in the range 0 to 1

    # Processing in BGR order
    frame_rgb[0]=0                                      # set the blue channel to 0

    frame_rgb = frame_rgb.mul(255).byte()               # convert back to integers 0 to 255
    frame_rgb = frame_rgb.permute(1, 2, 0)              # reshape to [H,W,C] BGRBGRBGRBGR
    # Transfer to main memory
    frame_np = frame_rgb.cpu().numpy()
    #frame_np = cv2.cvtColor(frame_np, cv2.COLOR_RGB2BGR)   # convert RGB to BGR order
    writer.write(frame_np)

    bar.update(1)
bar.close()
cap.release()
writer.release()

PyAV

test_pyav.py

import torch
from tqdm import tqdm
import argparse
import av
import ffstream.ycbcr as ycbcr

parser = argparse.ArgumentParser()
parser.add_argument('input_path'          , type=str  , help='input video file(full path)')
parser.add_argument('output_path'         , type=str  , help='output video file(full path)')
parser.add_argument('-m', '--mode'        , type=int  , default=0, choices=[0, 1, 2], help='mode 0=pyav  1=copy  2=ycbcr')
parser.add_argument('-c', '--cpu'         , action='store_true', help='use cpu')
args = parser.parse_args()

device = torch.device('cpu' if args.cpu else 'cuda')

container_input  = av.open(args.input_path)
container_output = av.open(args.output_path, 'w')
mbps = 10

stream_input  = container_input.streams.video[0]
stream_output = container_output.add_stream(stream_input.codec_context.name, rate=stream_input.average_rate)
stream_output.pix_fmt    = stream_input.codec_context.pix_fmt
stream_output.bit_rate   = mbps*1000000
w      = stream_output.width  = stream_input.codec_context.width
h      = stream_output.height = stream_input.codec_context.height
frames = stream_input.frames

_COLOR = '\033[92m'
_END   = '\033[0m'
print(f'codec={stream_input.codec_context.name}, device={device}, {_COLOR}{args.output_path}{_END}')

if (args.mode==2):
    frame_rgb = torch.zeros(3,h,w, device=device)

bar = tqdm(total=frames, dynamic_ncols=True)
for packet in container_input.demux():
    if packet.dts is None:
        continue

    if (packet.stream.type == 'video'):
        for frame in packet.decode():

            if (args.mode==0):
                # Use PyAV's built-in conversion between YUV and RGB
                frame_rgb = frame.to_rgb()                              # convert YUV to RGB
                frame_t   = torch.frombuffer(frame_rgb.planes[0], dtype=torch.uint8)
                frame_rgb = frame_t.to(device)
                frame_rgb = frame_rgb.float().div(255)                  # convert to floats in the range 0 to 1
                frame_rgb = frame_rgb.reshape(h,w,3)
                frame_rgb = frame_rgb.permute(2, 0, 1)                  # reshape to [C,H,W] RRRRGGGGBBBB

                # Processing in RGB order
                frame_rgb[2]=0                                          # set the blue channel to 0

                frame_rgb = frame_rgb.permute(1, 2, 0)                  # reshape to [H,W,C] RGBRGBRGBRGB
                frame_rgb = frame_rgb.mul(255).byte()                   # convert back to integers 0 to 255

                # Transfer to main memory
                frame_np = frame_rgb.cpu().numpy()

                # Create a new frame for the encoder from the RGB data
                frame_yuv = av.VideoFrame.from_ndarray(frame_np, format='rgb24')
                container_output.mux(stream_output.encode(frame_yuv))

            elif args.mode==1:
                # To check PyAV's own decode/encode performance, output the received frame as is
                # (no transfer to video memory, no RGB conversion).
                # Ideally the received frame would be passed straight through, but its timestamps
                # do not match (and cannot be changed), so create a new frame and copy the data into it.
                frame_yuv = av.VideoFrame(w, h, format='yuv420p')
                frame_yuv.planes[0].update(frame.planes[0])
                frame_yuv.planes[1].update(frame.planes[1])
                frame_yuv.planes[2].update(frame.planes[2])
                container_output.mux(stream_output.encode(frame_yuv))

            elif args.mode==2:
                # Example of RGB/YUV conversion using FFStream's ycbcr module
                # Transfer the received frame to video memory, convert it to RGB, and process it
                frame_y = torch.frombuffer(frame.planes[0], dtype=torch.uint8).to(device)
                frame_u = torch.frombuffer(frame.planes[1], dtype=torch.uint8).to(device)
                frame_v = torch.frombuffer(frame.planes[2], dtype=torch.uint8).to(device)
                frame_yuv = torch.cat((frame_y, frame_u, frame_v), dim=0)

                ycbcr.yuv420_to_rgb(frame_rgb, frame_yuv, w, h, ycbcr.COLOR_SPACE_BT709)

                # Processing in RGB order
                #frame_rgb = torch.clamp(frame_rgb, 0, 1)                       # values can fall outside 0 to 1; clamp if necessary
                frame_rgb[2]=0    # set the blue channel to 0

                # Convert RGB to YUV
                ycbcr.rgb_to_yuv420(frame_yuv, frame_rgb, w, h, ycbcr.COLOR_SPACE_BT709)

                # Transfer to main memory
                frame_np  = frame_yuv.cpu().numpy()

                # Create a new PyAV YUV frame
                frame_yuv = av.VideoFrame(w, h, format='yuv420p')
                frame_yuv.planes[0].update(frame_np[0:w*h])
                frame_yuv.planes[1].update(frame_np[w*h:w*h+w//2*h//2])
                frame_yuv.planes[2].update(frame_np[w*h+w//2*h//2:])
                container_output.mux(stream_output.encode(frame_yuv))

            bar.update(1)
container_output.mux(stream_output.encode())
bar.close()
container_output.close()
container_input.close()

FFStream

test_ffstream.py

import torch
import argparse
from tqdm import tqdm
from ffstream.ffstream import FFStream

parser = argparse.ArgumentParser()
parser.add_argument('input_path'            , type=str  , help='input video file(full path)')
parser.add_argument('output_path'           , type=str  , help='output video file(full path)')

parser.add_argument('-p', '--pix_fmt'       , type=str  , default='yuv420p', choices=['yuv420p', 'rgb24'], help='pixel format')
parser.add_argument('-r', '--crf'           , type=int  , default=20       , help='crf')
parser.add_argument('-i', '--input_options' , type=str  , default=None     , help='ffmpeg input options')
parser.add_argument('-o', '--output_options', type=str  , default=None     , help='ffmpeg output options')
parser.add_argument('-c', '--cpu'           , action='store_true', help='use cpu')
parser.add_argument('-a', '--audio_copy'    , action='store_true', help='use audio copy')
parser.add_argument('-y', '--ycbcr_cuda'    , action='store_true', help='use YCbCr cuda native module')
args = parser.parse_args()

device = torch.device('cpu' if args.cpu else 'cuda')
if args.ycbcr_cuda:
    # RGB/YUV conversion, native CUDA version: CUDA only (fast), not usable on CPU, no FP16 support
    import ffstream.ycbcr_cuda as ycbcr
else:
    # RGB/YUV conversion, Python version: works on CUDA (slow) and on CPU (very slow), supports FP16
    import ffstream.ycbcr as ycbcr

_COLOR = '\033[92m'
_END   = '\033[0m'
print(f'op={args.input_options}/{args.output_options}, pix={args.pix_fmt}, crf={args.crf}, a_copy={args.audio_copy}, dev={device}, ycb_cuda={args.ycbcr_cuda}, {_COLOR}{args.output_path}{_END}')
# Arguments: input file, output file, quality (crf), pixel format,
# copy the audio stream if present, device used for frame processing, input options, output options
ff = FFStream(args.input_path, args.output_path, crf=args.crf, pix_fmt=args.pix_fmt,
              copy_audio_stream=args.audio_copy, device=device, input_options=args.input_options, output_options=args.output_options)

w = ff.width        # width of the input video
h = ff.height       # height of the input video
frames = ff.frames  # total number of frames in the input video

if args.pix_fmt == 'yuv420p':
    frame_rgb = torch.zeros(3,h,w, device=device)
    bar = tqdm(total=frames, dynamic_ncols=True)
    while True:
        frame_bytes = ff.recv_frame()                               # [0] : YYYYUV
        if frame_bytes is None:
            break                                                   # leave the loop when the pipe has ended

        # Transfer to video memory
        frame_t = torch.frombuffer(frame_bytes, dtype=torch.uint8)  # warning  PyTorch does not support non-writeable tensors.
        frame_yuv = frame_t.to(device)

        # Convert YUV to RGB
        ycbcr.yuv420_to_rgb(frame_rgb, frame_yuv, w, h, ycbcr.COLOR_SPACE_BT709)

        # Processing in RGB order
        frame_rgb[2]=0                                              # set the blue channel to 0  [C,H,W] : RRRRGGGGBBBB

        # Convert RGB to YUV
        ycbcr.rgb_to_yuv420(frame_yuv, frame_rgb, w, h, ycbcr.COLOR_SPACE_BT709)

        # Transfer to main memory
        frame_np  = frame_yuv.cpu().numpy()

        # Send the YUV frame to FFmpeg's output stream
        ff.send_frame(frame_np)                                     # [0] : YYYYUV

        bar.update(1)
    bar.close()
    ff.close()

elif args.pix_fmt == 'rgb24':
    bar = tqdm(total=frames, dynamic_ncols=True)
    while True:
        # Receive an RGB frame from FFmpeg
        frame_rgb = ff.get_rgb()
        if frame_rgb is None:
            break                                       # reached the end of the video

        # Processing in RGB order
        frame_rgb[2]=0                                  # set the blue channel to 0  [C,H,W] : RRRRGGGGBBBB

        # Send the RGB frame to FFmpeg's output stream
        ff.put_rgb(frame_rgb)

        bar.update(1)
    bar.close()
    ff.close()

Test batch file

test_all.bat

set input_path=input.mp4
python test_cv.py %input_path% output_cv_mp4v.mp4         --encoder mp4v
python test_cv.py %input_path% output_cv_avc1_base.mp4    --encoder avc1
python test_cv.py %input_path% output_cv_avc1_nvenc.mp4   --encoder avc1
python test_pyav.py %input_path% output_pyav_mode0.mp4    --mode 0
python test_ffstream.py %input_path% output_ff_avc1_high.mp4   -p yuv420p
python test_ffstream.py %input_path% output_ff_mp4v.mp4        -p yuv420p  -o """-c:v mpeg4 -vtag mp4v -qscale:v 5"""
python test_ffstream.py %input_path% output_ff_avc1_nvenc.mp4  -p yuv420p  -o """-c:v h264_nvenc -b:v 6M"""

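(In my environment) OpenCV uses nvenc only when 'avc1' is specified and 'openh264-1.8.0-win64.dll' cannot be found, so I switch between the two cases manually as follows.

Before running the output_cv_avc1_base.mp4 test, copy 'openh264-1.8.0-win64.dll' into the same folder as test_cv.py.
Before running the output_cv_avc1_nvenc.mp4 test, make sure 'openh264-1.8.0-win64.dll' is not in the same folder as test_cv.py.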
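
The manual switch could also be scripted. The following is only a sketch under the assumption described above, namely that OpenCV picks up the DLL from the folder containing test_cv.py; `dll_hidden` is a hypothetical helper and the `.hidden` suffix is an arbitrary choice. It renames the DLL for the duration of a `with` block and restores it afterwards, even if the test run fails.

```python
import contextlib
from pathlib import Path


@contextlib.contextmanager
def dll_hidden(dll_path):
    """Temporarily rename a DLL so that it cannot be found, then restore it."""
    dll = Path(dll_path)
    hidden = dll.with_name(dll.name + '.hidden')
    moved = dll.exists()
    if moved:
        dll.rename(hidden)
    try:
        yield
    finally:
        if moved:
            hidden.rename(dll)


# Usage sketch (not executed here; file names follow test_all.bat):
#
#   import subprocess, sys
#   with dll_hidden('openh264-1.8.0-win64.dll'):
#       subprocess.run([sys.executable, 'test_cv.py', 'input.mp4',
#                       'output_cv_avc1_nvenc.mp4', '--encoder', 'avc1'],
#                      check=True)
```

Renaming in place rather than moving the file keeps the DLL next to the script, so a run that is interrupted between the two renames is easy to repair by hand.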
