
Resize options in torchvision

Posted at 2022-10-13

Resize options

torchvision's resize takes options such as interpolation and antialias.
These can usually be ignored without causing problems, but they can affect results in cases such as fine-tuning, where the backbone is not retrained.

For the sample image above, changing the options alters the resized image as follows.

import torchvision.transforms.functional as F
from torchvision.transforms import InterpolationMode

# x is assumed to be an image tensor of shape (C, H, W); size is the target size.
# Top left (the default when this was written):
F.resize(x, size, interpolation=InterpolationMode.BILINEAR)
# Top right:
F.resize(x, size, interpolation=InterpolationMode.BILINEAR, antialias=True)
# Bottom left:
F.resize(x, size, interpolation=InterpolationMode.BICUBIC)
# Bottom right:
F.resize(x, size, interpolation=InterpolationMode.BICUBIC, antialias=True)

Plain BILINEAR and plain BICUBIC both produce clearly visible jaggies (aliasing).
BILINEAR+Antialias and BICUBIC+Antialias differ very little from each other, and both show few jaggies.
The torchvision documentation contains the following note.

when downsampling, the interpolation of PIL images and tensors is slightly different, because PIL applies antialiasing.

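If data augmentation is performed while the images are still PIL or numpy objects, this is probably not a concern.
Conversely, it does need attention when the resize is performed on Tensors.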
In use cases such as not retraining the backbone model, matching the resize settings to those used in the original training avoids needless performance degradation.

References

torchvision.transforms.Resize
Image data taken from the Google Landmark V2 Dataset
