How to Use Kornia, a PyTorch-Based Image Processing Library

Posted at 2020-06-20

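Introduction

I normally use PyTorch as my deep learning framework, and I recently learned about Kornia, an image processing library built on top of it.
I looked into its basic features and usage, so I am leaving these notes here for future reference.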

What is Kornia?

Kornia is an open-source computer vision library implemented with PyTorch as its backend.
(Kornia GitHub)

It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions.

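It implements low-level image processing operations similar to those in OpenCV, such as filtering, color conversion, and geometric transformations.
Because PyTorch is the backend, these operations can run on the GPU and take part in automatic differentiation.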
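
To make the automatic differentiation point concrete, here is a minimal sketch in plain PyTorch (it does not use Kornia). The box blur written as a depthwise convolution is my own stand-in for an image operation, and the kernel size and image shape are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F

# Dummy batch of one 3-channel 8x8 image; requires_grad lets autograd track it.
image = torch.rand(1, 3, 8, 8, requires_grad=True)

# 3x3 box blur written as a depthwise convolution (one kernel per channel).
kernel = torch.full((3, 1, 3, 3), 1.0 / 9.0)
blurred = F.conv2d(image, kernel, padding=1, groups=3)

# Any scalar computed from the result can be back-propagated to the input.
loss = blurred.mean()
loss.backward()

print(image.grad.shape)  # same shape as the input image
```

The gradient reaches the input pixels because the blur is an ordinary tensor operation, which is the property that makes image operations usable inside a training loop.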

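Installation and Basic Usage

As the README describes, Kornia can be installed with pip, among other methods. (PyTorch is installed automatically as a dependency in this case.)
pip install kornia
Note: as of 2020-06-20, this installed kornia 0.3.1 and pytorch 1.5.1.

Running the tutorials also requires OpenCV, matplotlib, and torchvision.

As a basic usage example, applying a Gaussian blur to an image looks like this: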

import kornia
import cv2

# Load the image with OpenCV (BGR order) and convert it to RGB
img_src = cv2.imread('./data/lena.jpg')
img_src = cv2.cvtColor(img_src, cv2.COLOR_BGR2RGB)

# Convert to torch.Tensor
tensor_src = kornia.image_to_tensor(img_src, keepdim=False).float() # 1xCxHxW

# Gaussian Blur: kernel size (11, 11), sigma (10.5, 10.5)
gauss = kornia.filters.GaussianBlur2d((11, 11), (10.5, 10.5))
tensor_blur = gauss(tensor_src)

# Convert back to an OpenCV (numpy.ndarray) image
img_blur = kornia.tensor_to_image(tensor_blur.byte())

# --> show [img_src | img_blur]

In this way, the desired operation is applied to a torch.Tensor. (Incidentally, kornia.filters.GaussianBlur2d inherits from torch.nn.Module.)
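
To show what the 1xCxHxW layout means, here is a rough sketch of the same conversion in plain NumPy and PyTorch. This illustrates the layout change only; I am not claiming it matches Kornia's implementation of image_to_tensor / tensor_to_image. The zero-filled array is a hypothetical stand-in for an image loaded with OpenCV.

```python
import numpy as np
import torch

# Hypothetical stand-in for an image loaded with OpenCV: HxWxC, uint8.
img = np.zeros((240, 320, 3), dtype=np.uint8)

# HxWxC -> CxHxW, then add a leading batch dimension -> 1xCxHxW.
tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float()
print(tensor.shape)  # torch.Size([1, 3, 240, 320])

# Reverse direction: drop the batch dimension and move channels last again.
restored = tensor.squeeze(0).permute(1, 2, 0).byte().numpy()
print(restored.shape)  # (240, 320, 3)
```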

Other Image Processing Examples

Below are a few examples of other blur operations and color adjustments.

# Box Blur
tensor_blur = kornia.box_blur(tensor_src, (9, 9))

# Median Blur
tensor_blur = kornia.median_blur(tensor_src, (5, 5))

# Adjust Brightness
tensor_brightness = kornia.adjust_brightness(tensor_src, 0.6)

# Adjust Contrast
tensor_contrast = kornia.adjust_contrast(tensor_src, 0.2)

# Adjust Gamma
tensor_gamma = kornia.adjust_gamma(tensor_src, gamma=3., gain=1.5)

# Adjust Saturation
tensor_saturated = kornia.adjust_saturation(tensor_src, 0.2)

# Adjust Hue
tensor_hue = kornia.adjust_hue(tensor_src, 0.5)
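
One thing to watch: as far as I can tell from the Kornia documentation, the adjust_* functions expect input values in the range [0, 1], so a float tensor that still holds 0 to 255 values should probably be divided by 255 first. The sketch below shows why, using a textbook gamma correction. gamma_correct is my own hypothetical helper, not Kornia's implementation.

```python
import torch

def gamma_correct(image: torch.Tensor, gamma: float, gain: float = 1.0) -> torch.Tensor:
    """Textbook gamma correction: gain * image ** gamma, clamped to [0, 1]."""
    return (gain * image.pow(gamma)).clamp(0.0, 1.0)

# Pixel values as they come from an 8-bit image, as floats in [0, 255].
pixels_255 = torch.tensor([0.0, 64.0, 128.0, 255.0])

# Normalize to [0, 1] before applying the adjustment.
pixels_01 = pixels_255 / 255.0
out = gamma_correct(pixels_01, gamma=3.0, gain=1.5)
print(out)

# Without normalization every non-zero pixel saturates at 1.0.
saturated = gamma_correct(pixels_255, gamma=3.0, gain=1.5)
print(saturated)
```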

Combining with torch.nn.Sequential

Grouping operations like the ones above in nn.Sequential keeps image preprocessing code tidy. An example follows.
It assumes an environment with a GPU and runs the processing on the GPU.

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import kornia

class DummyDataset(Dataset):
    def __init__(self):
        self.data_index = range(100)

    def __len__(self):
        return len(self.data_index)

    def __getitem__(self, idx):
        # generate dummy image and label
        image = torch.rand(3, 240, 320)
        label = torch.randint(5, (1,))
        return image, label

device = torch.device('cuda')

dataset = DummyDataset()
loader = DataLoader(dataset, batch_size=16, shuffle=True)

transform = nn.Sequential(
    kornia.color.AdjustSaturation(0.2),
    kornia.color.AdjustBrightness(0.5),
    kornia.color.AdjustContrast(0.7),
)

for i, (images, labels) in enumerate(loader):
    print(f'iter: {i}, images: {images.shape}, labels: {labels.shape}')

    images = images.to(device) # --> move to the GPU
    images_tr = transform(images) # apply the transform to the images

    # training etc ...
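
The same pattern should work with any nn.Module that takes a BxCxHxW batch, so hand-written transforms can be mixed in as well. Below is a minimal sketch in plain PyTorch. ScaleBrightness is a hypothetical module I made up for illustration, and the device selection falls back to the CPU in case no GPU is available.

```python
import torch
import torch.nn as nn

class ScaleBrightness(nn.Module):
    """Hypothetical transform: multiply pixel values by a factor, clamp to [0, 1]."""

    def __init__(self, factor: float):
        super().__init__()
        self.factor = factor

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return (images * self.factor).clamp(0.0, 1.0)

# Fall back to the CPU when no GPU is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

transform = nn.Sequential(
    ScaleBrightness(0.5),
    nn.AvgPool2d(kernel_size=3, stride=1, padding=1),  # simple 3x3 box blur
).to(device)

images = torch.rand(16, 3, 240, 320, device=device)  # dummy batch, BxCxHxW
images_tr = transform(images)
print(images_tr.shape, images_tr.device)
```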

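An Example Using Automatic Differentiation

As an example of using PyTorch's automatic differentiation, here is an excerpt from the tutorial total_variation_denoising.py (total variation denoising).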

total_variation_denoising.py

import cv2
import numpy as np
import torch
import kornia

# read the image with OpenCV
img: np.ndarray = cv2.imread('./data/doraemon.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255.0
img = img + np.random.normal(loc=0.0, scale=0.1, size=img.shape)
img = np.clip(img, 0.0, 1.0)

# convert to torch tensor
noisy_image: torch.Tensor = kornia.image_to_tensor(img).squeeze()  # CxHxW

# define the total variation denoising network
class TVDenoise(torch.nn.Module):
   def __init__(self, noisy_image):
       super(TVDenoise, self).__init__()
       self.l2_term = torch.nn.MSELoss(reduction='mean')
       self.regularization_term = kornia.losses.TotalVariation()
       # create the variable which will be optimized to produce the noise free image
       self.clean_image = torch.nn.Parameter(data=noisy_image.clone(), requires_grad=True)
       self.noisy_image = noisy_image

   def forward(self):
       return self.l2_term(self.clean_image, self.noisy_image) + 0.0001 * self.regularization_term(self.clean_image)

   def get_clean_image(self):
       return self.clean_image

tv_denoiser = TVDenoise(noisy_image)

# define the optimizer to optimize the 1 parameter of tv_denoiser
optimizer = torch.optim.SGD(tv_denoiser.parameters(), lr=0.1, momentum=0.9)

# run the optimization loop
num_iters = 500
for i in range(num_iters):
   optimizer.zero_grad()
   loss = tv_denoiser()
   if i % 25 == 0:
       print("Loss in iteration {} of {}: {:.3f}".format(i, num_iters, loss.item()))
   loss.backward()
   optimizer.step()

# convert back to numpy
img_clean: np.ndarray = kornia.tensor_to_image(tv_denoiser.get_clean_image())

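Here, noisy_image is passed to torch.nn.Parameter() and becomes the initial state of clean_image. (This is what the optimizer updates.)
Kornia's TotalVariation() is used as the regularization term.
Note: the total variation regularization term penalizes large differences between adjacent pixels, which removes noise components and smooths the image.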
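
To illustrate what this penalty measures, here is a hand-written sketch in plain PyTorch. I am assuming the anisotropic definition (sum of absolute differences between vertical and horizontal neighbors). Kornia's TotalVariation may differ in details such as reduction, so treat total_variation below as a hypothetical helper rather than the library's code.

```python
import torch

def total_variation(image: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation of a CxHxW image: sum of absolute neighbor differences."""
    diff_h = image[:, 1:, :] - image[:, :-1, :]   # vertical neighbors
    diff_w = image[:, :, 1:] - image[:, :, :-1]   # horizontal neighbors
    return diff_h.abs().sum() + diff_w.abs().sum()

torch.manual_seed(0)
smooth = torch.full((3, 32, 32), 0.5)                 # perfectly flat image
noisy = (smooth + 0.1 * torch.randn_like(smooth)).clamp(0.0, 1.0)

print(total_variation(smooth).item())  # 0.0 for a constant image
print(total_variation(noisy).item())   # much larger, since neighbors differ
```

Minimizing this value together with the L2 term pulls the optimized image toward something smooth that still stays close to the noisy input.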

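Summary

I looked into how to use Kornia, a PyTorch-based image processing library, mainly by following its tutorials.
There appear to be many other useful features beyond what is covered here.
Because it can be used inside a neural network's forward pass and not only for image preprocessing, it seems handy when you want to add processing that plain torch/torchvision cannot handle on its own.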
