@ryou1128 posted at 2024-07-12

Stuck on real-time playback with TensorRT

Q&A

Closed

What I want to solve

I want to convert an ONNX file into an engine with TensorRT and play video in real time in my own playback system, but I am stuck around engine generation.

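What I want to run on TensorRT is "SRVGGNetCompact", the Real-ESRGAN architecture that is light enough for real-time playback. These are the so-called Compact models that have recently become popular in some circles for AI video playback, and there are a few sites where such models are posted (https://openmodeldb.info/).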

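The usual approach is to run these models through VapourSynth in a player such as mpv, but, odd as it may be, I got the urge to play them back with my own playback system, which is how I ended up with the current problem.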

I am testing with my own 2x ONNX models trained with neosr and traiNNer-redux (the conversion from .pth was done with chaiNNer).

When I run the model in mpv I get at least 20 fps. You could say I should simply play the video in mpv, but I wanted to experiment with various settings and fell into a pitfall.

I fought with this for about two days using ChatGPT beforehand, but I am stuck at the error shown below.

With code that combines OpenCV, FFmpeg, and FFplay the video does play, but in slow motion, so I would like to do the processing with the recently released FFStream instead.
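One possible reason for slow-motion playback is that each frame takes longer to process than the source frame interval allows. The sketch below only illustrates that arithmetic. The helper names are hypothetical and are not part of OpenCV, FFmpeg, or FFStream, and the per-frame times are assumed to have been measured separately (for example with `time.perf_counter`).

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, for playback at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def achieved_fps(frame_times_ms: list[float]) -> float:
    """Average throughput implied by measured per-frame processing times."""
    if not frame_times_ms:
        raise ValueError("need at least one measurement")
    mean_ms = sum(frame_times_ms) / len(frame_times_ms)
    return 1000.0 / mean_ms


def keeps_up(frame_times_ms: list[float], source_fps: float) -> bool:
    """True if the measured pipeline is at least as fast as the source video."""
    return achieved_fps(frame_times_ms) >= source_fps
```

For example, a pipeline that needs 50 ms per frame delivers 20 fps, so a 30 fps source would play slower than real time unless frames are dropped.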

The error seems to occur around def process_with_ai_engine(frame_rgb, engine):.

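After I select the video and the ONNX file in the UI, I can tell the engine is being built because GPU usage in Task Manager goes to 100%, but partway through, the error below is printed to the console.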
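Since the engine is rebuilt from the ONNX file on every run, one common way to shorten iteration is to cache the serialized engine on disk. This is a minimal sketch under assumptions: `build_fn` is a hypothetical callable that returns the serialized engine (for instance, a wrapper around `builder.build_serialized_network`), and the cache file name is an arbitrary choice. A cached engine is generally only valid for the same GPU and TensorRT version that built it.

```python
from pathlib import Path
from typing import Callable


def engine_cache_path(onnx_path: str) -> Path:
    """Hypothetical naming scheme: model.onnx -> model.engine next to it."""
    return Path(onnx_path).with_suffix(".engine")


def load_or_build_engine_bytes(onnx_path: str,
                               build_fn: Callable[[str], bytes]) -> bytes:
    """Return serialized engine bytes, rebuilding only when the cache is stale."""
    onnx = Path(onnx_path)
    cache = engine_cache_path(onnx_path)
    # Reuse the cache only if it is at least as new as the ONNX file.
    if cache.exists() and cache.stat().st_mtime >= onnx.stat().st_mtime:
        return cache.read_bytes()
    data = build_fn(onnx_path)
    if data is None:
        raise RuntimeError("engine build failed")
    cache.write_bytes(data)  # accepts any bytes-like object
    return bytes(data)
```

The returned bytes could then be passed to `runtime.deserialize_cuda_engine`, as the posted `build_engine` already does with the freshly built engine.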

If anyone knows a solution, I would appreciate an answer. Thank you in advance.

Specs

OS: Windows 11
CPU: Core i7-13700KF
GPU: RTX 3080 Ti
RAM: 64 GB DDR4
Python: 3.11.4

Problem / error encountered


Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\ryou\AppData\Local\Programs\Python\Python311\Lib\tkinter\__init__.py", line 1948, in __call__
    return self.func(*args)
           ^^^^^^^^^^^^^^^^
  File "C:\Smooth Video\video_degradation_app\main.py", line 102, in process_and_play_video_ffstream
    frame_rgb = process_with_ai_engine(frame_rgb, engine)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Smooth Video\video_degradation_app\main.py", line 57, in process_with_ai_engine
    context.set_binding_shape(input_binding_index, frame_rgb.shape)
    ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'tensorrt_bindings.tensorrt.IExecutionContext' object has no attribute 'set_binding_shape'
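The AttributeError says that the installed TensorRT Python bindings have no `set_binding_shape`. As far as I know, the binding-index API was deprecated in favor of name-based tensor calls and is absent from TensorRT 10, where `IExecutionContext.set_input_shape(name, shape)` and `ICudaEngine.get_tensor_name(index)` are the counterparts. The sketch below is a hedged compatibility helper, not the library's recommended code: the function name is hypothetical, and it assumes that I/O tensor 0 is the model input, just as the posted code assumes binding 0 is.

```python
def set_input_shape_compat(context, engine, shape, input_index=0):
    """Set a dynamic input shape on either the newer or the older TensorRT API.

    Returns the name of the method that was used, for logging.
    """
    shape = tuple(int(d) for d in shape)
    if hasattr(context, "set_input_shape"):
        # Name-based API: look up the tensor name, then set the shape by name.
        name = engine.get_tensor_name(input_index)
        context.set_input_shape(name, shape)
        return "set_input_shape"
    # Older index-based API, as used in the code that raised the error.
    context.set_binding_shape(input_index, shape)
    return "set_binding_shape"
```

In the posted function this would stand in for the `context.set_binding_shape(input_binding_index, frame_rgb.shape)` call. The shape passed in still has to fall inside the optimization profile, which is declared as (1, 3, H, W).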

Code I wrote

import os
import tkinter as tk
from tkinter import filedialog, messagebox
import torch
from ffstream.ffstream import FFStream
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_file_path):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        config = builder.create_builder_config()
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1GB
        config.set_flag(trt.BuilderFlag.FP16)

        # Parse model file
        with open(onnx_file_path, 'rb') as model:
            if not parser.parse(model.read()):
                print('Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None

        # Create optimization profile
        profile = builder.create_optimization_profile()
        for i in range(network.num_inputs):
            input_name = network.get_input(i).name
            profile.set_shape(input_name, min=(1, 3, 64, 64), opt=(1, 3, 1080, 1920), max=(1, 3, 1080, 1920))
        config.add_optimization_profile(profile)

        # Build engine
        engine_bytes = builder.build_serialized_network(network, config)
        if engine_bytes is None:
            print('Failed to create engine')
            raise SystemExit(1)  # same effect as sys.exit(1); `sys` is never imported here

        with trt.Runtime(TRT_LOGGER) as runtime:
            engine = runtime.deserialize_cuda_engine(engine_bytes)

        return engine
 
def process_with_ai_engine(frame_rgb, engine):
    # Create execution context
    context = engine.create_execution_context()

    # Allocate device memory
    d_input = cuda.mem_alloc(frame_rgb.nelement() * frame_rgb.element_size())
    d_output = cuda.mem_alloc(frame_rgb.nelement() * frame_rgb.element_size())

    # Transfer input data to device
    cuda.memcpy_htod(d_input, frame_rgb.cpu().numpy().tobytes())

    # Set input shape
    input_binding_index = 0  # Assume the first binding is the input
    context.set_binding_shape(input_binding_index, frame_rgb.shape)

    # Execute model
    bindings = [int(d_input), int(d_output)]
    context.execute_v2(bindings)

    # Transfer predictions back
    output = torch.empty_like(frame_rgb)
    cuda.memcpy_dtoh(output.data_ptr(), d_output)

    return output

def select_video():
    file_path = filedialog.askopenfilename(filetypes=[("Video files", "*.mp4;*.avi;*.mov;*.mkv;*.m2ts;*.h264")])
    if file_path:
        video_path.set(file_path)

def select_model():
    file_path = filedialog.askopenfilename(filetypes=[("ONNX files", "*.onnx")])
    if file_path:
        model_path.set(file_path)

def process_and_play_video_ffstream():
    video_path_str = video_path.get()
    model_path_str = model_path.get()

    if not video_path_str or not model_path_str:
        messagebox.showerror("Error", "Please select both video file and model file")
        return

    device = torch.device('cuda')  # Process on the GPU with CUDA, using video memory (fast)

    # Input path; fast processing; can copy the video's audio stream; device used for frame processing
    ff = FFStream(video_path_str, 'output.mp4', crf=20, pix_fmt='yuv420p', copy_audio_stream=True, device=device)

    # Build TensorRT engine from ONNX file
    engine = build_engine(model_path_str)

    while True:
        # Receive an RGB frame from FFmpeg
        frame_rgb = ff.get_rgb()
        if frame_rgb is None:
            break  # Reached the end of the video

        # Process frame_rgb with the TensorRT engine
        frame_rgb = process_with_ai_engine(frame_rgb, engine)

        # Send the RGB frame to FFmpeg's output stream
        ff.put_rgb(frame_rgb)

    ff.close()

app = tk.Tk()
app.title("Video Processing App")

video_path = tk.StringVar()
model_path = tk.StringVar()

tk.Label(app, text="Select Video File:").grid(row=0, column=0, padx=10, pady=10)
tk.Entry(app, textvariable=video_path, width=50).grid(row=0, column=1, padx=10, pady=10)
tk.Button(app, text="Browse", command=select_video).grid(row=0, column=2, padx=10, pady=10)

tk.Label(app, text="Select ONNX Model File:").grid(row=1, column=0, padx=10, pady=10)
tk.Entry(app, textvariable=model_path, width=50).grid(row=1, column=1, padx=10, pady=10)
tk.Button(app, text="Browse", command=select_model).grid(row=1, column=2, padx=10, pady=10)

tk.Button(app, text="Process and Play Video", command=process_and_play_video_ffstream).grid(row=2, columnspan=3, padx=10, pady=20)

app.mainloop()
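Separately from the AttributeError, the posted `process_with_ai_engine` allocates `d_output` with the same byte count as the input and creates the result with `torch.empty_like(frame_rgb)`, while a 2x model produces four times as many elements. The sketch below shows only the shape arithmetic. It assumes the frame arrives as (height, width, channels), which I have not confirmed for FFStream, and the item size depends on the engine's actual I/O data type.

```python
import math


def nchw_shape_from_hwc(hwc_shape):
    """(H, W, C) frame shape -> (1, C, H, W), the layout used in the profile."""
    h, w, c = hwc_shape
    return (1, c, h, w)


def upscaled_shape(nchw_shape, scale):
    """Output shape of an image model that upscales height and width by `scale`."""
    n, c, h, w = nchw_shape
    return (n, c, h * scale, w * scale)


def buffer_nbytes(shape, itemsize):
    """Bytes needed for a dense tensor of `shape` with `itemsize` bytes per element."""
    return math.prod(shape) * itemsize
```

For a 1080p frame and a 2x model with 4-byte elements, `buffer_nbytes(upscaled_shape(nchw_shape_from_hwc((1080, 1920, 3)), 2), 4)` is four times the input buffer size, so both the device allocation and the host tensor that receives the result would need to be sized from the output shape.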
