More than 3 years have passed since last update.

torchvision.transforms について

Posted at 2020-09-15

記事概要

torchvision.transforms の使い方メモです。なお、transforms.functional はカバーしていません。

`torchvision.transforms` とは

画像データの入力を加工してくれる callable なクラス群です。画像オブジェクトを渡すだけで前処理 (オーグメント/正規化等) を行ってくれる便利なクラスですが、PIL オブジェクトにのみ対応している transforms と torch.Tensor のみに対応している transforms が存在するため、適用順を間違えるとエラーが出ます。 どちらに対応したクラスかは公式のリファレンスに載っているため、使う前にチェックするといいと思います。基本的には PIL で操作して、正規化だけは torch.Tensor にすることを覚えておけばよいです。

また、detection や segmentation など ground truth に座標を含むようなタスクではこれらのクラスを利用できない (イメージの変換はできるが、座標の変換はできない) ため、自分でクラスを作る必要があります。

PIL のみに対応したクラス

サイズ系

CenterCrop
FiveCrop
Pad
RandomAffine
RandomCrop
RandomResizedCrop
Resize
TenCrop

色系

ColorJitter
Grayscale
RandomAffine
RandomGrayscale

反転系

RandomHorizontalFlip
RandomVerticalFlip
TenCrop

回転系

RandomPerspective
RandomRotation

汎用

RandomApply
RandomChoice
RandomOrder

torch.Tensor のみに対応したクラス

線形変換

LinearTransformation

正規化

Normalize

オーグメント

RandomErasing (https://arxiv.org/pdf/1708.04896.pdf)

その他

型変換

ToPILImage(numpy.ndarray or torch.Tensor -> PIL)
ToTensor (numpy.ndarray or PIL -> torch.Tensor)

汎用

Lambda

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up