
Detecting differences between similar images with OpenCV-Python

Last updated at 2020-01-07, posted at 2019-12-31

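What I want to do

For example, checking how Office documents look after being converted to PDF by different typesetting engines.
I want images that can serve as evidence and that let a person see roughly what changed at a glance.
Looking through every image is also tiring, so I want the degree of difference as a number, so that a person only has to check the images that differ the most.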

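What you need

An environment that can run Python 3
OpenCV-Python

It is unofficial, but there is a prebuilt OpenCV package for Python, so let's just install that.
pip install opencv-python
numpy, which it depends on, is installed along with it.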

Python code

diff_img.py

import pathlib

import cv2
import numpy as np

source_dir = pathlib.Path('source_img')
target_dir = pathlib.Path('target_img')
result_dir = pathlib.Path('result_img')
result_dir.mkdir(exist_ok=True)  # the log and the diff images are written here
log_file = result_dir / 'result.log'
kernel = np.ones((3, 3), np.uint8)  # 3x3 kernel for the morphological opening


def pad_to(img, height, width):
    """Place img at the top-left of a black canvas of the given size."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    canvas[0:img.shape[0], 0:img.shape[1]] = img
    return canvas


with open(log_file, mode='w') as fs:
    for source_file in sorted(source_dir.glob('*.*')):
        target_file = target_dir / source_file.name
        source_img = cv2.imread(str(source_file))
        target_img = cv2.imread(str(target_file))
        # cv2.imread returns None for missing or unreadable files.
        if source_img is None or target_img is None:
            fs.write(str(target_file) + '...skipped.\n')
            continue

        # Pad both images to a common size so they can be compared.
        max_height = max(source_img.shape[0], target_img.shape[0])
        max_width = max(source_img.shape[1], target_img.shape[1])
        source_img = pad_to(source_img, max_height, max_width)
        target_img = pad_to(target_img, max_height, max_width)

        # Blend the two images 50/50 as the background of the result image.
        result_img = cv2.addWeighted(source_img, 0.5, target_img, 0.5, 0)

        # Grayscale absolute difference, binarized with Otsu's method,
        # then a morphological opening to remove small noise.
        source_gray = cv2.cvtColor(source_img, cv2.COLOR_BGR2GRAY)
        target_gray = cv2.cvtColor(target_img, cv2.COLOR_BGR2GRAY)
        img = cv2.absdiff(source_gray, target_gray)
        _, img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU)
        img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)

        # [-2] selects the contour list on both OpenCV 3.x (three return
        # values) and OpenCV 4.x (two return values).
        contours = cv2.findContours(img, cv2.RETR_TREE,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]
        # Outline the differing regions in red (BGR order).
        result_img = cv2.drawContours(result_img, contours, -1, (0, 0, 255))

        # Score: total contour area divided by the area of the whole image.
        score = sum(cv2.contourArea(contour) for contour in contours)
        score /= max_height * max_width
        fs.write(target_file.name + ', ' + str(score) + '\n')
        cv2.imwrite(str(result_dir / source_file.name), result_img)

The comparison method is based on this article:
[サイゼリヤの間違い探しが難しすぎたので大人の力で解決した](http://kawalabo.blogspot.com/2014/11/blog-post.html)

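The score is simply the total area of the regions detected as different, divided by the area of the whole image. It is written to the log together with the file name.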

result.log

test-1.png, 0.01231201710816777
test-2.png, 0.0084626793598234
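A log in this format can be filtered so that only the images with large differences go to a person for review. The following is a minimal sketch using only the standard library. The function name `files_to_review` and the threshold of 0.01 are assumptions chosen for illustration, so pick a threshold that suits your own documents.

```python
def files_to_review(log_lines, threshold=0.01):
    """Return (file name, score) pairs at or above threshold, largest first.

    Hypothetical helper: expects lines of the form "name, score" as in result.log.
    """
    flagged = []
    for line in log_lines:
        name, sep, value = line.strip().rpartition(', ')
        if not sep:
            continue  # e.g. "...skipped." lines carry no score
        try:
            score = float(value)
        except ValueError:
            continue  # ignore anything that is not a number
        if score >= threshold:
            flagged.append((name, score))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


sample = [
    'test-1.png, 0.01231201710816777',
    'test-2.png, 0.0084626793598234',
    'target_img/test-3.png...skipped.',
]
flagged = files_to_review(sample)
print(flagged)
```

With the two log lines shown above and the assumed threshold, only test-1.png is flagged.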

Addendum, January 7, 2020

Images of different sizes can now be compared.
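The size handling works by copying each image into the top-left corner of a black canvas as large as the larger of the two images in each dimension. Here is a minimal NumPy-only sketch of that idea. The helper names `pad_to` and `pad_pair` are assumptions introduced for illustration.

```python
import numpy as np


def pad_to(img, height, width):
    """Copy img into the top-left corner of a zero-filled (black) canvas."""
    canvas = np.zeros((height, width) + img.shape[2:], dtype=img.dtype)
    canvas[:img.shape[0], :img.shape[1]] = img
    return canvas


def pad_pair(a, b):
    """Pad two images to the same height and width."""
    height = max(a.shape[0], b.shape[0])
    width = max(a.shape[1], b.shape[1])
    return pad_to(a, height, width), pad_to(b, height, width)


# Two white BGR images with different sizes.
a = np.full((2, 3, 3), 255, dtype=np.uint8)
b = np.full((4, 2, 3), 255, dtype=np.uint8)
padded_a, padded_b = pad_pair(a, b)
print(padded_a.shape, padded_b.shape)
```

Because the padding is black, any area covered by only one of the images shows up as a difference unless that image is also black there, so a size mismatch raises the score by itself.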

Example run

The two tables look alike, but in the comparison target the table was slightly narrower and the margins were larger.
