
Reading JPEG images with TensorFlow gives different quality from other modules

Posted at 2021-09-03, last updated at 2021-09-10

Introduction

TensorFlow has functions for reading images. For example, the following code:

import tensorflow as tf

img_path = 'hogehoge.jpg'
raw = tf.io.read_file(img_path)
img = tf.image.decode_image(raw, channels=3)

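Some time ago, in my research, I switched my preprocessing from cv2 to TensorFlow and found that the results changed for some reason when this function was used.
When I investigated, it turned out that the JPEG quality changes. (Added 2021/09/11: for the reason, see "Addendum: the reason" at the end.)

So let's compare what happens with other modules. The target is JPEG. Note that PNG shows no change in quality.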

Comparison

The modules used are as follows:

Tensorflow
OpenCV2
NumPy (via fromfile)
PIL
imageio
matplotlib

Versions:

tensorflow              2.6.0
opencv-contrib-python   4.5.3.56
numpy                   1.19.5
Pillow                  8.3.1
imageio                 2.9.0
matplotlib              3.4.3

Verification code:


import cv2
import numpy as np
import tensorflow as tf
import imageio
from PIL import Image
import matplotlib.pyplot as plt

img_path = 'peko_ra.jpg'

class Images:
    def __init__(self, img_path) -> None:
        self.img_path = img_path
    
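    def using_tensorflow(self):
        raw = tf.io.read_file(self.img_path)
        img_tf = tf.image.decode_image(raw, channels=3)
        # Reverse the channel axis (RGB -> BGR) to match cv2.imread.
        img_tf = tf.reverse(img_tf, axis=[-1]).numpy()
        return img_tf

    def using_cv2(self):
        # cv2 returns BGR, which is the common channel order used here.
        return cv2.imread(self.img_path, cv2.IMREAD_COLOR)

    def using_pil(self):
        # PIL returns RGB; reverse the last axis to get BGR.
        img = np.array(Image.open(self.img_path))[...,::-1]
        return img
    
    def using_imageio(self):
        # imageio returns RGB; reverse the last axis to get BGR.
        img = imageio.imread(self.img_path)[...,::-1]
        return img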
    
    def using_matplotlib(self):
        img = plt.imread(self.img_path)[...,::-1]
        return img
    
    def using_numpy(self):
        raw = np.fromfile(self.img_path, np.uint8)
        return cv2.imdecode(raw, cv2.IMREAD_COLOR)
    
    def make_patterns(self):
        functions = [self.using_tensorflow, self.using_cv2, self.using_pil, self.using_imageio, self.using_matplotlib, self.using_numpy]
        rets = []
        # Pair every loader with each loader after it, so that all 15 combinations are checked.
        for i, f in enumerate(functions):
            for _f in functions[i+1:]:
                rets.append([f, _f, f.__name__, _f.__name__])
        return rets
    
    def check_diffs(self):
        x = self.make_patterns()
        for _x in x:
            fn_1, fn_2, fn_1_name, fn_2_name = _x
            msg_tmp = f'"{fn_1_name}" and "{fn_2_name}":'
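            # Cast to int16 so that the subtraction can go negative without wrapping.
            img_1 = fn_1().astype(np.int16)
            img_2 = fn_2().astype(np.int16)

            diff = img_1 - img_2
            uniques = np.unique(diff)
            # A single unique value means the two loaders produced no pixel-wise variation.
            if len(uniques) == 1:
                print(msg_tmp, 'same quality.')
                continue
            print(msg_tmp, uniques)
            # Rescale the difference to [0, 255] and save it as an image.
            img_diffs = np.float32(diff)
            img_diffs = (img_diffs - img_diffs.min())/(img_diffs.max() - img_diffs.min())*255
            cv2.imwrite(f'diffs_{fn_1_name}_{fn_2_name}.png', img_diffs.astype(np.uint8))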

Images(img_path).check_diffs()

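The image used for this verification is a scene of Usada Pekora (兎田ぺこら) from Hologra (ホロぐら). Posting the image itself here would be a bit awkward, so I only give the YouTube link (around the 12-second mark). ~~I just wanted to use it. I like this scene.~~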

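To decide whether the quality differs, I took the difference between the loaded arrays and counted the unique elements of that difference with np.unique.
Two or more unique elements means the outputs differ; exactly one means they are the same.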
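The check can be illustrated on small synthetic arrays instead of real images. This is only a sketch: the helper name `diff_uniques` and the sample arrays are made up for this example and are not part of the verification code.

```python
import numpy as np


def diff_uniques(img_a, img_b):
    """Return the unique values of img_a - img_b, computed in int16."""
    # uint8 arithmetic wraps around (0 - 1 gives 255), so cast before subtracting.
    return np.unique(img_a.astype(np.int16) - img_b.astype(np.int16))


# Two hypothetical 2x2 single-channel "images".
base = np.array([[10, 20], [30, 40]], dtype=np.uint8)
same = base.copy()
shifted = np.array([[11, 20], [28, 40]], dtype=np.uint8)

print(diff_uniques(base, same))     # one unique value: no difference
print(diff_uniques(base, shifted))  # several unique values: the arrays differ
```

The cast to a signed type matters: without it, a pixel that is 1 lower in the first image would show up as a difference of 255 rather than -1.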

Results

The results were as follows.

	OpenCV2	Numpy	PIL	imageio	matplotlib
Tensorflow	differs	differs	differs	differs	differs
OpenCV2		same	same	same	same
Numpy			same	same	same
PIL				same	same
imageio					same

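TensorFlow's output differed from that of every other module, and in every case the unique elements of the difference were [-5 -4 -3 -2 -1 0 1 2 3]. The other modules, on the other hand, all gave exactly the same result as each other.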

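As a test, let's take the difference image between TensorFlow and cv2 (the image loaded with TensorFlow minus the image loaded with cv2).
To make it easier to see, the minimum and maximum of the difference image were rescaled to [0, 255].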
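The rescaling step can be sketched in isolation as follows. The function name `rescale_to_uint8` and the sample values are assumptions made for this illustration; the guard for a constant input is an addition that the verification code does not have.

```python
import numpy as np


def rescale_to_uint8(diff):
    """Linearly map diff so that its minimum becomes 0 and its maximum 255."""
    diff = np.asarray(diff, dtype=np.float32)
    lo, hi = diff.min(), diff.max()
    if hi == lo:
        # A constant difference has no contrast; return an all-zero image.
        return np.zeros(diff.shape, dtype=np.uint8)
    return ((diff - lo) / (hi - lo) * 255).astype(np.uint8)


# Hypothetical difference values spanning the range -5 to 3.
sample = np.array([[-5, -1], [0, 3]], dtype=np.int16)
scaled = rescale_to_uint8(sample)
print(scaled)
```

Note that with an asymmetric range such as -5 to 3, a difference of zero does not land on mid-gray: it maps to about 159, so pixels that match exactly appear as a fairly light gray in the saved image.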

The shape of the original image is visible in the difference.

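Conclusion

Reading a JPEG with TensorFlow gave a different quality from reading it with the other modules. The error is very small, though, so unless you need to be particular about data quality, it is fine to treat this article as a "huh, interesting". For the reason, see the section directly below or yoya's comment.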

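Addendum: the reason

As pointed out in the comments, when a JPEG image is read with the default arguments of tf.image.decode_image or tf.image.decode_jpeg, the quality drops as seen in this verification.

TensorFlow's Python functions generally call into pywrap_tfe, specifying the Op by name (see tensorflow/python/ops/gen_xxx_ops.py; for the image functions, gen_image_ops.py).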

`tf.image.decode_image`->`gen_image_ops.decode_image`

    ...
    try:
      _result = pywrap_tfe.TFE_Py_FastPathExecute(
        _ctx, "DecodeImage", name, contents, "channels", channels, "dtype",
        dtype, "expand_animations", expand_animations)
      return _result
    ...

`tf.image.decode_jpeg`->`gen_image_ops.decode_jpeg`

    ...
    try:
      _result = pywrap_tfe.TFE_Py_FastPathExecute(
        _ctx, "DecodeJpeg", name, contents, "channels", channels, "ratio",
        ratio, "fancy_upscaling", fancy_upscaling, "try_recover_truncated",
        try_recover_truncated, "acceptable_fraction", acceptable_fraction,
        "dct_method", dct_method)
      return _result
    ...

The Op for tf.image.decode_jpeg is DecodeJpeg and the Op for tf.image.decode_image is DecodeImage. Searching the official GitHub repository for these Op names leads to decode_image_op.cc, which contains the following:

    if (op_type_ == "DecodeJpeg" || op_type_ == "DecodeAndCropJpeg") {
      // ...
      // The TensorFlow-chosen default for JPEG decoding is IFAST, sacrificing
      // image quality for speed.
      if (dct_method.empty() || dct_method == "INTEGER_FAST") {
        flags_.dct_method = JDCT_IFAST;
      } else if (dct_method == "INTEGER_ACCURATE") {
        flags_.dct_method = JDCT_ISLOW;
      }
    } else {
      flags_ = jpeg::UncompressFlags();
      flags_.dct_method = JDCT_IFAST;
    }
    // ...

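Back in the Python function, the default value of dct_method in tf.image.decode_jpeg is ''. This matches dct_method.empty(), so by default the DCT method is, as the comment in the code above says, one that decodes quickly at the expense of image quality. DecodeImage does not enter the if branch above, but as the else branch below it shows, its DCT method is also set to JDCT_IFAST, which corresponds to INTEGER_FAST.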
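The effect of the default can be checked within TensorFlow alone by decoding the same JPEG bytes twice. The following is a minimal sketch, assuming TensorFlow 2.x in eager mode; the random test image and the variable names are made up for illustration, and the differences you see will depend on the image.

```python
import numpy as np
import tensorflow as tf

# Hypothetical test data: a random RGB image encoded to JPEG in memory.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
jpeg_bytes = tf.io.encode_jpeg(tf.constant(pixels), quality=90)

# dct_method is left at its default of '', which the kernel treats as INTEGER_FAST.
fast = tf.image.decode_jpeg(jpeg_bytes, channels=3)
# Explicitly request the slower, more accurate integer DCT.
accurate = tf.image.decode_jpeg(
    jpeg_bytes, channels=3, dct_method="INTEGER_ACCURATE")

# Compare in int16 so that negative differences are not wrapped.
diff = fast.numpy().astype(np.int16) - accurate.numpy().astype(np.int16)
print(np.unique(diff))
```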

If you want the quality of JPEG images to match the other modules, changing the code as follows fixes it.

img = tf.image.decode_jpeg(raw, channels=3, dct_method="INTEGER_ACCURATE")

As shown above, tf.image.decode_image has no such parameter among its arguments, so this cannot be changed from the Python side.
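One possible workaround is to check the file contents with tf.io.is_jpeg and call tf.image.decode_jpeg only for JPEGs, falling back to tf.image.decode_image otherwise. This is a sketch under the assumption of eager execution; the helper name `read_image_accurate` and the temporary demo file are hypothetical and not from the original code.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def read_image_accurate(img_path):
    """Read an image file, decoding JPEGs with the accurate integer DCT."""
    raw = tf.io.read_file(img_path)
    if tf.io.is_jpeg(raw):
        # JPEG: decode_jpeg exposes dct_method, so request the accurate DCT.
        return tf.image.decode_jpeg(
            raw, channels=3, dct_method="INTEGER_ACCURATE")
    # Other formats: fall back to the generic decoder.
    return tf.image.decode_image(raw, channels=3, expand_animations=False)


# Demo with a hypothetical gradient image written to a temporary JPEG file.
pixels = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
jpg_path = os.path.join(tempfile.mkdtemp(), "sample.jpg")
tf.io.write_file(jpg_path, tf.io.encode_jpeg(pixels))
img = read_image_accurate(jpg_path)
print(img.shape, img.dtype)
```

Passing expand_animations=False keeps the fallback output three-dimensional, so both branches return an array of shape (height, width, 3).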

Thank you, yoya.
