Extracting face images from a video with ffmpeg and OpenVINO

Posted at 2020-10-04

This article combines OpenVINO's interactive_face_detection_demo, which I tried previously, with ffmpeg to extract only the face images from a video.

1. Prepare the video

The video is assumed to be named input.mp4 and located in the Downloads folder on Windows 10.

2. Convert the video to 1 fps

interactive_face_detection_demo runs inference on every frame, so input.mp4 is first converted to 1 frame per second. This shortens the processing time and also makes the later ffmpeg extraction step simpler.

cd /cygdrive/c/Users/${USER}/Downloads/
ls -l input.mp4

mkdir -p output/
ffmpeg -i input.mp4 -r 1 output/input_r1.mp4 -y

3. Detect faces with the OpenVINO demo

This is almost the same as what I tried before, except that the -r option is added to get the raw output.

echo 'source ${INTEL_OPENVINO_DIR}/bin/setupvars.sh

cd ${INTEL_CVSDK_DIR}/inference_engine/demos/
sed -i "s/*)/interactive_face_detection_demo)/g" CMakeLists.txt
./build_demos.sh

${INTEL_CVSDK_DIR}/deployment_tools/tools/model_downloader/downloader.py \
  --name face-detection-adas-0001 \
  --output_dir /content/model/ \
  --precisions FP32

echo `date`: start detection

/root/omz_demos_build/intel64/Release/interactive_face_detection_demo \
  -i /Downloads/output/input_r1.mp4 \
  -m /content/model/intel/face-detection-adas-0001/FP32/face-detection-adas-0001.xml \
  -no_show \
  -no_wait \
  -async \
  -r > /Downloads/output/raw.txt

echo `date`: end detection' | docker run -v /c/Users/${USER}/Downloads:/Downloads -u root -i --rm openvino/ubuntu18_dev:2020.4

4. Extract faces with ffmpeg based on raw.txt

raw.txt

～～
[116,1] element, prob = 0.0198675    (-4,209)-(48,48)
[117,1] element, prob = 0.0198515    (444,146)-(68,68)
[0,1] element, prob = 0.999333    (222,115)-(205,205) WILL BE RENDERED!
[1,1] element, prob = 0.0601832    (405,393)-(94,94)
～～

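As shown above, raw.txt lists the detection candidates for each frame in descending order of face likelihood. Candidates that look like a face (score of 0.5 or higher) are marked with WILL BE RENDERED!. Each line ends with the box in the form (x,y)-(width,height).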
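
Before picking a threshold, it can help to see how many frames and candidates a raw output file contains. The following is a minimal sketch, assuming the line format shown in the excerpt above. summarize_raw is a hypothetical helper name and is not part of OpenVINO or ffmpeg.

```shell
#!/bin/sh
# summarize_raw (hypothetical helper): count frames, candidates, and
# candidates whose probability exceeds a threshold in a raw.txt-style file.
#   $1: path to the raw output   $2: probability threshold
summarize_raw() {
  awk -v th="$2" '
    /^\[0,1\]/ { frames++ }          # candidate [0,1] opens a new frame
    /prob = / {
      total++
      # The field right after "=" is the probability.
      for (i = 1; i < NF; i++) if ($i == "=") { p = $(i + 1) + 0; break }
      if (p > th) kept++
    }
    END {
      printf "frames=%d candidates=%d above_threshold=%d\n", frames, total, kept
    }
  ' "$1"
}
```

For example, `summarize_raw raw.txt 0.9` prints one summary line. Trying a few thresholds this way shows how many crops each one would produce.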

THRESHOLD=0.9
perl -ne '$i++ if m{^\[0,1\]}; printf "ffmpeg -loglevel error -ss ".($i-1)." -i input_r1.mp4 -vframes 1 -vf crop=$4:$5:$2:$3 %05d.jpg -y\n", ++$j if m{([0-9.]+)\s+\((\d+),(\d+)\)-\((\d+),(\d+)\)} and $1 > '${THRESHOLD} raw.txt > ffmpeg.sh

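The number of [0,1] lines seen so far gives the frame number, which is later passed to the -ss option (because the video is 1 fps, the frame number can be passed as-is to -ss, which takes seconds). Only candidates with a high score are kept (WILL BE RENDERED! marks 0.5 or higher, but that is quite loose, so a higher threshold is used above). For each of them, the crop filter parameters are taken from the coordinates as crop=width:height:x:y and an ffmpeg command is printed (the regions always seem to be square). The command is meant to be run in output/, where raw.txt and input_r1.mp4 are located.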
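
The same logic can be spelled out step by step, which may be easier to adapt than the one-liner. This is a sketch in awk under the assumption that every candidate line looks like the raw.txt excerpt. gen_crop_commands is a hypothetical name. Like the Perl version, it skips boxes with negative coordinates.

```shell
#!/bin/sh
# gen_crop_commands (hypothetical helper): print one ffmpeg crop command per
# candidate whose probability exceeds the threshold.
#   $1: raw output file   $2: probability threshold   $3: input video name
gen_crop_commands() {
  awk -v th="$2" -v video="$3" '
    /^\[0,1\]/ { frame++ }           # candidate [0,1] opens a new frame
    /prob = / {
      # Locate "=", then read the probability and the box that follow it.
      for (i = 1; i < NF; i++) if ($i == "=") break
      prob = $(i + 1) + 0
      box  = $(i + 2)                # e.g. (222,115)-(205,205)
      if (prob <= th) next
      # Accept only non-negative integer coordinates.
      if (box !~ /^\([0-9]+,[0-9]+\)-\([0-9]+,[0-9]+\)$/) next
      gsub(/[()]/, "", box)          # 222,115-205,205
      sub(/-/, ",", box)             # 222,115,205,205  (x,y,w,h)
      split(box, v, ",")
      # 1 fps video: the zero-based frame index equals the seek time in seconds.
      printf "ffmpeg -loglevel error -ss %d -i %s -vframes 1 -vf crop=%d:%d:%d:%d %05d.jpg -y\n",
             frame - 1, video, v[3], v[4], v[1], v[2], ++n
    }
  ' "$1"
}
```

A possible invocation is `gen_crop_commands raw.txt 0.9 input_r1.mp4 > ffmpeg.sh`. Passing the threshold with -v avoids splicing a shell variable into the program text.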

ffmpeg.sh

ffmpeg -loglevel error -ss 32 -i input_r1.mp4 -vframes 1 -vf crop=36:36:109:178 00001.jpg -y
ffmpeg -loglevel error -ss 36 -i input_r1.mp4 -vframes 1 -vf crop=34:34:107:177 00002.jpg -y
ffmpeg -loglevel error -ss 37 -i input_r1.mp4 -vframes 1 -vf crop=32:32:108:178 00003.jpg -y
ffmpeg -loglevel error -ss 39 -i input_r1.mp4 -vframes 1 -vf crop=32:32:109:179 00004.jpg -y
ffmpeg -loglevel error -ss 40 -i input_r1.mp4 -vframes 1 -vf crop=37:37:97:178 00005.jpg -y
ffmpeg -loglevel error -ss 41 -i input_r1.mp4 -vframes 1 -vf crop=34:34:46:176 00006.jpg -y
ffmpeg -loglevel error -ss 44 -i input_r1.mp4 -vframes 1 -vf crop=64:64:552:236 00007.jpg -y

This produces a file containing a list of commands like the ones above.

sh ffmpeg.sh

Running the file with the shell writes out each face image.

5. Resize if needed

mogrify -resize 128x128! *.jpg

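Making the images a uniform size, for example with ImageMagick's mogrify, is likely to be convenient. The ! in 128x128! forces the exact size regardless of aspect ratio, and mogrify overwrites the files in place.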
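
As an alternative that the original workflow does not use, the resize could be folded into the ffmpeg commands by chaining a scale filter after crop. The sketch below rewrites a generated command file accordingly. add_scale is a hypothetical helper, and it assumes each command contains a crop=w:h:x:y filter as in the ffmpeg.sh listing.

```shell
#!/bin/sh
# add_scale (hypothetical helper): append ",scale=N:N" to each crop filter so
# that ffmpeg crops and resizes in a single pass.
#   $1: command file such as ffmpeg.sh   $2: target edge length such as 128
add_scale() {
  sed -E "s/(crop=[0-9]+:[0-9]+:[0-9]+:[0-9]+)/\1,scale=$2:$2/" "$1"
}
```

For instance, `add_scale ffmpeg.sh 128 > ffmpeg_128.sh` followed by `sh ffmpeg_128.sh` would write 128x128 images directly, with no separate mogrify step.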
