
Benchmark: loading an image, resizing it, converting it to RGB, and returning a normalized NumPy array in Python

Posted and last updated: 2019-09-22

I measured and compared how long the processing described in the title takes with OpenCV and with Pillow in Python.

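Conclusion

OpenCV is about 20% faster (roughly 100 ms vs. 122 ms per call).
- However, which library is faster may depend on the processing, so if your pipeline differs from this one, it is better to measure it yourself.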
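For readers who want to measure their own pipeline outside a notebook, the following is a minimal sketch built on the standard-library `timeit` module. The helper name `time_per_call_ms` is hypothetical (it is not part of the original benchmark), and the defaults simply mirror the "7 runs, 10 loops each" setup that `%%timeit` reports in this article.

```python
import timeit


def time_per_call_ms(func, *args, repeat=7, number=10):
    """Return (mean, best) per-call time in milliseconds for func(*args).

    Hypothetical helper: `repeat` timing runs, each calling func `number` times.
    """
    totals = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    per_call_ms = [total / number * 1000.0 for total in totals]
    return sum(per_call_ms) / len(per_call_ms), min(per_call_ms)
```

It could be called as `time_per_call_ms(cv2_bench, image_path, resize_h, resize_w)` and again with `pil_bench`, then the two results compared.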

Environment

OS: macOS Mojave 10.14.6
CPU: Intel Core i5
Python: 3.7.3
cv2: 4.1.1
PIL: 6.1.0
Run in JupyterLab (1.0.9)

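Target image

I looked into this because I am currently training on a dataset of large images, so I used a fairly large image: a photo of the office ceiling that I took on the spot with my phone.
It is 512 KB, 4160 × 3120 pixels, RGB.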

Code


import numpy as np
import cv2
import PIL
from PIL import Image
from line_profiler import LineProfiler

image_path = './ceiling.jpg'
resize_h = 512
resize_w = 512

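def cv2_bench(image_path, resize_h, resize_w):
    im = cv2.imread(image_path)  # decodes the file eagerly; returns a BGR array
    resized = cv2.resize(im, (resize_w, resize_h))  # dsize is (width, height)
    np_im = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB) / 255.  # BGR -> RGB, scale to [0, 1]
    return np_im

def pil_bench(image_path, resize_h, resize_w):
    im = Image.open(image_path)  # lazy: pixel data is not decoded here
    resized = im.resize((resize_w, resize_h))  # size is (width, height)
    np_im = np.asarray(resized.convert('RGB')) / 255.  # ensure RGB, scale to [0, 1]
    return np_im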

Measurement results

Results measured with `%%timeit`

cv2

Code executed

%%timeit
cv2_bench(image_path, resize_h, resize_w)

Result

100 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PIL

Code executed

%%timeit
pil_bench(image_path, resize_h, resize_w)

Result

122 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
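To make the "about 20%" figure concrete, the two `%%timeit` means can be compared in both directions. This is only a small arithmetic sketch; the function name `relative_difference` is hypothetical, and the inputs are the means reported above.

```python
cv2_ms = 100.0  # mean per loop reported by %%timeit for cv2_bench
pil_ms = 122.0  # mean per loop reported by %%timeit for pil_bench


def relative_difference(fast_ms, slow_ms):
    """Return (extra time of the slow one, time saved by the fast one) as fractions."""
    return slow_ms / fast_ms - 1.0, 1.0 - fast_ms / slow_ms


pil_extra, cv2_saving = relative_difference(cv2_ms, pil_ms)
print(f"PIL takes {pil_extra:.0%} longer; OpenCV takes {cv2_saving:.0%} less time")
```

Depending on which library is used as the baseline, the gap reads as 22% or 18%, which is consistent with "about 20%".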

Results with line_profiler

cv2

Code executed

prof = LineProfiler()
prof.add_function(cv2_bench)
for _ in range(100):
    prof.runcall(cv2_bench, image_path, resize_h, resize_w)
prof.print_stats(output_unit=1e-6)

Result

Timer unit: 1e-06 s

Total time: 9.78859 s
File: <ipython-input-5-f3195f6f8fbd>
Function: cv2_bench at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def cv2_bench(image_path, resize_h, resize_w):
     2       100    9592910.0  95929.1     98.0      im = cv2.imread(image_path)
     3       100      93328.0    933.3      1.0      resized = cv2.resize(im, (resize_h, resize_w))
     4       100     102205.0   1022.0      1.0      np_im = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB) / 255.
     5       100        143.0      1.4      0.0      return np_im

PIL

Code executed


prof = LineProfiler()
prof.add_function(pil_bench)
for _ in range(100):
    prof.runcall(pil_bench, image_path, resize_h, resize_w)
prof.print_stats(output_unit=1e-6)

Result

Timer unit: 1e-06 s

Total time: 12.2011 s
File: <ipython-input-6-3d3ad2a2f650>
Function: pil_bench at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def pil_bench(image_path, resize_h, resize_w):
     2       100     119025.0   1190.2      1.0      im = Image.open(image_path)
     3       100   11924871.0 119248.7     97.7      resized = im.resize((resize_w, resize_h))
     4       100     157020.0   1570.2      1.3      np_im = np.asarray(resized.convert('RGB')) / 255.
     5       100        148.0      1.5      0.0      return np_im

Summary

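According to the line_profiler results, OpenCV was about 2.4 seconds faster over 100 runs (9.79 s vs. 12.20 s in total).
Extrapolating naively, that amounts to about 40 minutes saved over 100,000 runs.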
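For actual training workloads, running the preprocessing in multiple processes or using Pillow-SIMD would also be worth considering, but it was good to establish a baseline.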
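As one possible shape for the multi-process option, here is a minimal sketch using the standard-library `concurrent.futures` module. The helper name `preprocess_all` and its parameters are hypothetical; it was not part of the benchmark and no timings were measured for it.

```python
import concurrent.futures as cf


def preprocess_all(paths, preprocess, max_workers=None,
                   executor_cls=cf.ProcessPoolExecutor):
    """Apply preprocess(path) to every path in parallel, preserving input order.

    Hypothetical helper. With the default process pool, `preprocess` must be
    picklable, for example a function defined at module top level.
    """
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(preprocess, paths))
```

A call might look like `preprocess_all(paths, functools.partial(cv2_bench, resize_h=512, resize_w=512))`. Each result array is sent back to the parent process, so the speed-up is not guaranteed and would need to be measured. When working in a notebook, the worker function may have to live in an importable module, depending on the platform's process start method.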

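As an aside, these results taught me that PIL.Image.open() is a lazy operation: in the profile above, Image.open() takes only about 1.2 ms per call, and the decoding cost shows up in the im.resize() line instead.
Reference: https://pillow.readthedocs.io/en/3.1.x/reference/Image.html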
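The lazy behavior can be observed directly by timing `Image.open()` and the first `load()` call separately. The sketch below assumes Pillow is installed and uses a synthetic in-memory JPEG rather than the ceiling photo; the function name `time_open_and_load` is hypothetical.

```python
import io
import time

from PIL import Image


def time_open_and_load(width=4160, height=3120):
    """Return (open_seconds, load_seconds) for a synthetic in-memory JPEG."""
    # Build a JPEG in memory so the sketch does not need a file on disk.
    buf = io.BytesIO()
    Image.new("RGB", (width, height), color=(128, 128, 128)).save(buf, format="JPEG")
    buf.seek(0)

    start = time.perf_counter()
    im = Image.open(buf)  # identifies the file; pixel data is not decoded yet
    open_s = time.perf_counter() - start

    start = time.perf_counter()
    im.load()  # forces the actual JPEG decode
    load_s = time.perf_counter() - start
    return open_s, load_s
```

On a large image, `load_s` would be expected to dominate, which matches the profile above where the cost appears on the first line that needs pixel data.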
