What is PyAV?
See "Getting an arbitrary frame from a video with PyAV" (PyAVで動画の任意のフレームを取得する).
Reading a video file and getting its information
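Video files use a wide variety of containers and codecs. I wanted to inspect that information, but there are few articles on how to do it, so I wrote this one. It covers the following:
- File size
- Bit rate
- Video length (seconds)
- Number of frames
- Frame rate (fps)
- Height and width
- Keyframe detection
Converting a frame to an ndarray was also a little tricky, so I tried several ways of doing it.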
import av


def load_from_file(
    filename,
    any_frame=False,
    backward=True
):
    container = av.open(filename)
    # First video stream. Files normally have only one, so this is fine.
    stream = container.streams.video[0]

    print("filename:", container.name)
    # print("filesize [bytes]:", container.size)
    print("filesize [kB]:", container.size // 1024)
    if container.size > 1024 * 1024:
        print("filesize [MB]:", container.size // 1024 // 1024)
    # print("bit_rate [b/s]:", container.bit_rate)
    print("bit_rate [kb/s]:", float(container.bit_rate) / 1024)
    # container.duration is in units of 1 / av.time_base seconds
    print("container duration [sec]:",
          float(container.duration) / av.time_base)
    # In units of stream.time_base; None when the container does not store it
    print("stream duration:", stream.duration)
    print("frames:", stream.frames)
    print("guessed frames:",
          int(float(container.duration) / av.time_base * stream.base_rate))
    print("container format name:", container.format.name)
    print("container format long name:", container.format.long_name)
    print("codec name:", stream.codec_context.codec.name)
    print("codec long name:", stream.codec_context.codec.long_name)
    print("codec tag:", stream.codec_context.codec_tag)
    for md_str in container.metadata.keys():
        print(f"container metadata {md_str}:", container.metadata[md_str])
    print("base_rate [fps]:", stream.base_rate)
    print("rate [fps]:", stream.codec_context.rate)
    print("width [pix]:", stream.codec_context.width)
    print("height [pix]:", stream.codec_context.height)

    sec = 2  # seek to the 2-second mark
    container.seek(
        offset=int(sec / stream.time_base),  # offset is in stream.time_base units
        any_frame=any_frame,
        backward=backward,
        stream=stream)

    for i, frame in enumerate(container.decode(video=0)):
        print(f"i:{i:3d}, time:{frame.time:.3f}, pts:{frame.pts}, "
              f"{frame.width}x{frame.height}, {frame.format.name} ",
              end="")
        if frame.key_frame:
            print("keyframe", end="")
        print()

        # Resize using ffmpeg's own scaler. Probably fast.
        frame = frame.reformat(width=244, height=244)

        # Conversion to ndarray
        # Without a format this returns a single 2-D ndarray. For yuv420p the
        # Y, U and V planes are stacked into one array, so it is not a plain
        # grayscale image. Some pixel formats raise an error.
        img = frame.to_ndarray()
        img = frame.to_rgb().to_ndarray()  # 3-channel ndarray (RGB)
        img = frame.to_ndarray(format="rgb24")  # same as above
        img = frame.to_rgb().to_ndarray(width=244, height=244)  # resize while converting to RGB
        img = frame.to_ndarray(format="rgb24", width=244, height=244)  # same as above

        # Conversion to PIL
        img = frame.to_image()  # PIL image
        img = frame.to_image(width=244, height=244)  # resize while converting to PIL
        img.save(
            "frames.{:04d}.jpg".format(i),
            quality=80,
        )  # save the PIL image

        if i > 5:
            break


if __name__ == '__main__':
    filenames = [
        # long (untrimmed) videos
        "zlVkeKC6Ha8.mp4",  # AVA, mp4, h264/avc
        "v_nHE7u40plD0.mkv",  # ActivityNet, mkv, vp9
        "001YG.mp4",  # Charades, mp4, h264/avc
        "-4wsuPCjDBc_5_15.avi",  # MSVD, avi, h264
        "s07-d72-cam-002.avi",  # MPII Cooking 2, avi, msmpeg4v2/MP42
        # short (trimmed) videos
        "-3B32lodo2M_000059_000069.mp4",  # Kinetics, mp4, h264/avc
        "v_ApplyEyeMakeup_g01_c01.avi",  # UCF101, avi, mpeg4/XVID
        "April_09_brush_hair_u_nm_np1_ba_goo_1.avi",  # HMDB, avi, mpeg4/DX50
        "200000.webm",  # SSv2, webm, vp9
    ]
    for filename in filenames:
        print("==================")
        try:
            load_from_file(filename, any_frame=False, backward=True)
        except Exception as e:
            # Report the failure instead of silently skipping the file
            print("failed:", e)
            continue
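The `offset` passed to `container.seek` is expressed in units of `stream.time_base`, not in seconds. As a minimal sketch of that conversion, the snippet below uses only `fractions.Fraction`. The helper name `seconds_to_offset` and the two example time bases are assumptions for illustration and are not part of PyAV.

```python
from fractions import Fraction


def seconds_to_offset(sec, time_base):
    """Convert seconds to an integer offset counted in time_base units."""
    return int(Fraction(sec) / time_base)


# Assumed time bases, for illustration only.
print(seconds_to_offset(2, Fraction(1, 90000)))  # 180000
print(seconds_to_offset(2, Fraction(1, 12800)))  # 25600
```

The same number of seconds maps to very different offsets depending on the stream, which is why the time base has to be read from each file.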
The results are shown below.
AVA is mp4 with h264/avc, so the output is straightforward.
filename: zlVkeKC6Ha8.mp4
filesize [kB]: 224274
filesize [MB]: 219
bit_rate [kb/s]: 637.2333984375
container duration [sec]: 2815.605
stream duration: 253392000
frames: 84464
guessed frames: 84468
container format name: mov,mp4,m4a,3gp,3g2,mj2
container format long name: QuickTime / MOV
codec name: h264
codec long name: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
codec tag: avc1
container metadata major_brand: isom
container metadata minor_version: 512
container metadata compatible_brands: isomiso2avc1mp41
container metadata encoder: Lavf56.40.101
base_rate [fps]: 30
rate [fps]: 30
width [pix]: 640
height [pix]: 480
i: 0, time:0.000, pts:0, 640x480, yuv420p keyframe
i: 1, time:0.033, pts:3000, 640x480, yuv420p
i: 2, time:0.067, pts:6000, 640x480, yuv420p
i: 3, time:0.100, pts:9000, 640x480, yuv420p
i: 4, time:0.133, pts:12000, 640x480, yuv420p
i: 5, time:0.167, pts:15000, 640x480, yuv420p
i: 6, time:0.200, pts:18000, 640x480, yuv420p
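The `stream duration` value in the log above looks odd next to the container duration, but it appears to be counted in the stream's time base. As a rough check, the sketch below infers the time base from the log (pts advances by 3000 per frame at 30 fps, which suggests 1/90000; this is an inference, not a value the script prints) and converts the stream duration to seconds.

```python
from fractions import Fraction

fps = Fraction(30)            # base_rate from the log
pts_step = 3000               # pts difference between consecutive frames
time_base = 1 / (fps * pts_step)  # inferred seconds per pts tick

stream_duration = 253392000   # "stream duration" from the log
seconds = stream_duration * time_base

print(time_base)                 # 1/90000
print(round(float(seconds), 3))  # 2815.467
```

The result is close to the container duration of 2815.605 seconds, which supports reading the value as a duration in time-base ticks.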
ActivityNet. Most files are mp4 with h264/avc, but mkv files with vp9 are occasionally mixed in.
filename: v_nHE7u40plD0.mkv
filesize [kB]: 49262
filesize [MB]: 48
bit_rate [kb/s]: 2708.220703125
container duration [sec]: 145.519
stream duration: None
frames: 0
guessed frames: 7275
container format name: matroska,webm
container format long name: Matroska / WebM
codec name: vp9
codec long name: Google VP9
codec tag:
container metadata ENCODER: Lavf56.25.101
base_rate [fps]: 50
rate [fps]: 50
width [pix]: 1280
height [pix]: 720
i: 0, time:0.000, pts:0, 1280x720, yuv420p keyframe
i: 1, time:0.020, pts:20, 1280x720, yuv420p
i: 2, time:0.040, pts:40, 1280x720, yuv420p
i: 3, time:0.060, pts:60, 1280x720, yuv420p
i: 4, time:0.080, pts:80, 1280x720, yuv420p
i: 5, time:0.100, pts:100, 1280x720, yuv420p
i: 6, time:0.120, pts:120, 1280x720, yuv420p
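In the log above `frames` is 0 and `stream duration` is `None`, so for this mkv file only the guessed frame count is usable. One way to handle this is to fall back to duration times frame rate when the reported count is missing. The following is a sketch under that assumption: `frame_count` is a hypothetical helper, and the microsecond durations are derived from the durations printed in the logs.

```python
from fractions import Fraction

AV_TIME_BASE = 1_000_000  # container.duration is counted in microseconds


def frame_count(reported, duration_us, fps):
    """Return the reported frame count, or estimate it when it is 0 or None."""
    if reported:
        return reported
    return int(Fraction(duration_us, AV_TIME_BASE) * Fraction(fps))


print(frame_count(84464, 2_815_605_000, 30))  # 84464 (reported value is used)
print(frame_count(0, 145_519_000, 50))        # 7275 (estimated)
```

The estimate can differ from the reported value by a few frames, as the AVA log shows (84468 guessed versus 84464 reported), so it is best treated as an approximation.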
Charades is mp4 with h264/avc.
filename: 001YG.mp4
filesize [kB]: 1484
filesize [MB]: 1
bit_rate [kb/s]: 386.2177734375
container duration [sec]: 30.744
stream duration: 919919
frames: 919
guessed frames: 921
container format name: mov,mp4,m4a,3gp,3g2,mj2
container format long name: QuickTime / MOV
codec name: h264
codec long name: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
codec tag: avc1
container metadata major_brand: isom
container metadata minor_version: 512
container metadata compatible_brands: isomiso2avc1mp41
container metadata encoder: Lavf56.40.101
base_rate [fps]: 30000/1001
rate [fps]: 30000/1001
width [pix]: 480
height [pix]: 270
i: 0, time:0.000, pts:0, 480x270, yuv420p keyframe
i: 1, time:0.033, pts:1001, 480x270, yuv420p
i: 2, time:0.067, pts:2002, 480x270, yuv420p
i: 3, time:0.100, pts:3003, 480x270, yuv420p
i: 4, time:0.133, pts:4004, 480x270, yuv420p
i: 5, time:0.167, pts:5005, 480x270, yuv420p
i: 6, time:0.200, pts:6006, 480x270, yuv420p
MSVD is avi with h264/avc.
filename: -4wsuPCjDBc_5_15.avi
filesize [kB]: 679
bit_rate [kb/s]: 541.75
container duration [sec]: 10.027743
stream duration: 300
frames: 300
guessed frames: 299
container format name: avi
container format long name: AVI (Audio Video Interleaved)
codec name: h264
codec long name: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
codec tag: H264
container metadata software: MEncoder SVN-r33477-4.2.1
base_rate [fps]: 359/12
rate [fps]: 29917/1000
width [pix]: 480
height [pix]: 360
i: 0, time:1.136, pts:34, 480x360, yuv420p keyframe
i: 1, time:1.170, pts:35, 480x360, yuv420p
i: 2, time:1.203, pts:36, 480x360, yuv420p
i: 3, time:1.237, pts:37, 480x360, yuv420p
i: 4, time:1.270, pts:38, 480x360, yuv420p
i: 5, time:1.304, pts:39, 480x360, yuv420p
i: 6, time:1.337, pts:40, 480x360, yuv420p
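In the log above the first decoded frame is at 1.136 s even though the seek target was 2 s, presumably because with `any_frame=False` and `backward=True` decoding starts from a keyframe before the target. To reach the requested time, the earlier frames can be skipped. The sketch below simulates this with plain tuples instead of real PyAV frames. The time base 12/359 is inferred from the log (pts 34 at 1.136 s), and `skip_until` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import dropwhile

TIME_BASE = Fraction(12, 359)  # inferred: 34 * 12/359 is about 1.136


def skip_until(frames, target_sec):
    """Drop (pts, time) pairs until time reaches target_sec."""
    return dropwhile(lambda f: f[1] < target_sec, frames)


# Simulated decode output starting at the keyframe with pts 34.
decoded = [(pts, pts * TIME_BASE) for pts in range(34, 100)]
pts, t = next(skip_until(decoded, 2))
print(pts, round(float(t), 3))  # 60 2.006
```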
MPII Cooking 2 is avi with MPEG-4 (not h264).
filename: s07-d72-cam-002.avi
filesize [kB]: 19374
filesize [MB]: 18
bit_rate [kb/s]: 2460.560546875
container duration [sec]: 62.993197
stream duration: 1852
frames: 1852
guessed frames: 1851
container format name: avi
container format long name: AVI (Audio Video Interleaved)
codec name: msmpeg4v2
codec long name: MPEG-4 part 2 Microsoft variant version 2
codec tag: MP42
container metadata software: MEncoder dev-SVN-r26940
base_rate [fps]: 147/5
rate [fps]: 147/5
width [pix]: 1624
height [pix]: 1224
i: 0, time:0.000, pts:0, 1624x1224, yuv420p keyframe
i: 1, time:0.034, pts:1, 1624x1224, yuv420p
i: 2, time:0.068, pts:2, 1624x1224, yuv420p
i: 3, time:0.102, pts:3, 1624x1224, yuv420p
i: 4, time:0.136, pts:4, 1624x1224, yuv420p
i: 5, time:0.170, pts:5, 1624x1224, yuv420p
i: 6, time:0.204, pts:6, 1624x1224, yuv420p
Kinetics was downloaded with youtube-dl, so it is mp4 with h264/avc.
filename: -3B32lodo2M_000059_000069.mp4
filesize [kB]: 1330
filesize [MB]: 1
bit_rate [kb/s]: 1063.935546875
container duration [sec]: 10.008
stream duration: 128000
frames: 250
guessed frames: 250
container format name: mov,mp4,m4a,3gp,3g2,mj2
container format long name: QuickTime / MOV
codec name: h264
codec long name: H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
codec tag: avc1
container metadata major_brand: isom
container metadata minor_version: 512
container metadata compatible_brands: isomiso2avc1mp41
container metadata encoder: Lavf57.41.100
base_rate [fps]: 25
rate [fps]: 25
width [pix]: 640
height [pix]: 360
i: 0, time:0.000, pts:0, 640x360, yuv420p keyframe
i: 1, time:0.040, pts:512, 640x360, yuv420p
i: 2, time:0.080, pts:1024, 640x360, yuv420p
i: 3, time:0.120, pts:1536, 640x360, yuv420p
i: 4, time:0.160, pts:2048, 640x360, yuv420p
i: 5, time:0.200, pts:2560, 640x360, yuv420p
i: 6, time:0.240, pts:3072, 640x360, yuv420p
UCF101 is avi with MPEG-4 (encoded with Xvid).
filename: v_ApplyEyeMakeup_g01_c01.avi
filesize [kB]: 287
bit_rate [kb/s]: 350.806640625
container duration [sec]: 6.56
stream duration: 164
frames: 164
guessed frames: 164
container format name: avi
container format long name: AVI (Audio Video Interleaved)
codec name: mpeg4
codec long name: MPEG-4 part 2
codec tag: XVID
container metadata software: MEncoder r33883
base_rate [fps]: 25
rate [fps]: 25
width [pix]: 320
height [pix]: 240
i: 0, time:0.040, pts:1, 320x240, yuv420p keyframe
i: 1, time:0.080, pts:2, 320x240, yuv420p
i: 2, time:0.160, pts:4, 320x240, yuv420p
i: 3, time:0.120, pts:3, 320x240, yuv420p
i: 4, time:0.200, pts:5, 320x240, yuv420p
i: 5, time:0.280, pts:7, 320x240, yuv420p
i: 6, time:0.240, pts:6, 320x240, yuv420p
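Note that the pts values in the log above are not monotonic (1, 2, 4, 3, 5, 7, 6). If frames are needed in presentation order, one option is to reorder them with a small buffer. The following is only a sketch of that idea, not something the script above does. `reorder_by_pts` and the buffer depth of 2 are assumptions chosen to fit this particular log.

```python
import heapq


def reorder_by_pts(pts_stream, depth=2):
    """Yield pts values in ascending order using a buffer of `depth` items."""
    heap = []
    for pts in pts_stream:
        heapq.heappush(heap, pts)
        if len(heap) > depth:
            yield heapq.heappop(heap)
    while heap:  # flush whatever is left in the buffer
        yield heapq.heappop(heap)


logged = [1, 2, 4, 3, 5, 7, 6]  # pts order from the log
print(list(reorder_by_pts(logged)))  # [1, 2, 3, 4, 5, 6, 7]
```

A buffer that is too shallow for the stream's reordering distance would still emit frames out of order, so the depth has to be chosen per codec.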
HMDB51 is avi with MPEG-4 (encoded with DivX).
filename: April_09_brush_hair_u_nm_np1_ba_goo_1.avi
filesize [kB]: 754
bit_rate [kb/s]: 458.4296875
container duration [sec]: 13.166667
stream duration: 395
frames: 395
guessed frames: 395
container format name: avi
container format long name: AVI (Audio Video Interleaved)
codec name: mpeg4
codec long name: MPEG-4 part 2
codec tag: DX50
base_rate [fps]: 30
rate [fps]: 30
width [pix]: 320
height [pix]: 240
i: 0, time:0.033, pts:1, 320x240, yuv420p keyframe
i: 1, time:0.100, pts:3, 320x240, yuv420p
i: 2, time:0.167, pts:5, 320x240, yuv420p
i: 3, time:0.133, pts:4, 320x240, yuv420p
i: 4, time:0.200, pts:6, 320x240, yuv420p
i: 5, time:0.267, pts:8, 320x240, yuv420p
i: 6, time:0.233, pts:7, 320x240, yuv420p
SSv2 is webm with vp9 (SSv1 was distributed as JPEG frames).
filename: 200000.webm
filesize [kB]: 146
bit_rate [kb/s]: 335.42578125
container duration [sec]: 3.5
stream duration: None
frames: 0
guessed frames: 42
container format name: matroska,webm
container format long name: Matroska / WebM
codec name: vp9
codec long name: Google VP9
codec tag:
container metadata title: (C) 2017 Twenty Billion Neurons GmbH, 20BN-Someting-Something-Dataset V2
container metadata encoder: Lavf56.40.101
base_rate [fps]: 12
rate [fps]: 12
width [pix]: 427
height [pix]: 240
i: 0, time:0.000, pts:0, 427x240, yuv420p keyframe
i: 1, time:0.083, pts:83, 427x240, yuv420p
i: 2, time:0.167, pts:167, 427x240, yuv420p
i: 3, time:0.250, pts:250, 427x240, yuv420p
i: 4, time:0.333, pts:333, 427x240, yuv420p
i: 5, time:0.417, pts:417, 427x240, yuv420p
i: 6, time:0.500, pts:500, 427x240, yuv420p