More than 5 years have passed since last update.

[Image Style Transfer] Making a style-transfer video with PyTorch's image style transfer!


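#1 What I want to do
Style transfer uses deep learning to render one image in the style of another image. It is quicker to see an example than to describe it:

"2018年版 深層学習によるスタイル変換まとめ" (a 2018 summary of style transfer with deep learning)
https://qiita.com/ta-ka/items/b59286cdf9b4d9f9ff14

As that article shows, the idea is to blend a style image into a content image.
It looks fun!

When I tried it, it turned out to be surprisingly easy:

Original image (me)

Sunset (style)

Composite image

What I want to do here is apply this to a __video__.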

#2 Environment
The processing is fairly heavy, so I strongly recommend using __Google Colaboratory__.

The implementation is based on the code from the PyTorch tutorial.

What I used
Mac
--ffmpeg
--PIL
Colaboratory

#3 Rough procedure

####Edit the video

####Video → images

####Images → style transfer

####Style-transferred images → video

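#4 Editing the video

First, shorten the video. Using iMovie or a similar tool,
trim it to about 5 seconds.

As a rough guide, producing 1 second of video takes about 30 minutes
(about 1 minute per frame).
A 4-second video took me more than 2 hours.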
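
The estimate above is simple arithmetic: the number of frames is the clip length times the frame rate, and the total time is the number of frames times the time per frame. Here is a minimal sketch, assuming 30 fps and the roughly 1 minute per frame that I measured. `estimate_minutes` is a hypothetical helper written for this illustration, not part of any library.

```python
def estimate_minutes(seconds, fps=30, minutes_per_frame=1.0):
    """Return (frame count, estimated processing minutes) for a clip."""
    frames = int(seconds * fps)
    return frames, frames * minutes_per_frame


# Rough estimates for clips of 1, 4 and 5 seconds.
for seconds in (1, 4, 5):
    frames, minutes = estimate_minutes(seconds)
    print("{} s -> {} frames, about {:.1f} hours".format(seconds, frames, minutes / 60))
```

The real time per frame depends on the GPU you are given and on the image size, so treat the result only as a rough guide.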

#5 Video → images

ffmpeg -i input.mp4 -vcodec png image_%03d.png

This command turns the video into numbered images:
image_001.png
image_002.png
...
image_125.png
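
Before moving on, it can be worth checking that every numbered frame was actually written. The sketch below assumes the `image_%03d.png` naming from the command above. `frame_name` and `missing_frames` are hypothetical helper names introduced only for this example.

```python
import os


def frame_name(index, prefix="image"):
    """Build a file name matching ffmpeg's %03d pattern, e.g. image_001.png."""
    return "{}_{:03d}.png".format(prefix, index)


def missing_frames(directory, start, finish, prefix="image"):
    """Return the frame numbers in [start, finish] that have no file on disk."""
    return [i for i in range(start, finish + 1)
            if not os.path.exists(os.path.join(directory, frame_name(i, prefix)))]
```

For example, `missing_frames(".", 1, 125)` returns an empty list when all 125 frames exist in the current directory.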

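Next, the images fed to the PyTorch tutorial code need to be square.
(Strictly speaking, the style image and the source image probably just need to have the same aspect ratio.)

Crop them to whatever region you like with PIL or a similar tool.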

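from PIL import Image
import numpy as np

# range of frame numbers to process
pic_start = 1
pic_finish = 125

for i in np.arange(pic_start, pic_finish + 1):
  print("\r{:}".format(i), end="")

  # open image
  im = Image.open("image_{:0=3}.png".format(i))
  # crop box is (left, upper, right, lower); adjust it to the region you want.
  # This example keeps the 200x200 square at the top-left corner.
  box = (0, 0, 200, 200)
  im = im.crop(box)
  im.save("image2_{:0=3}.png".format(i))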
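
If you would rather keep the centre of each frame than a fixed region, the crop box can be computed from the image size. This is a minimal sketch and not what I actually used. `center_square_box` is a hypothetical helper, and its result follows PIL's `(left, upper, right, lower)` box order so that it can be passed to `im.crop`.

```python
def center_square_box(width, height):
    """Return the (left, upper, right, lower) box of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


print(center_square_box(1280, 720))  # (280, 0, 1000, 720)
```

Since `im.size` is `(width, height)` in PIL, the call would look like `im.crop(center_square_box(*im.size))`.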

Make the style image square in the same way. (Let's assume that has been done.)
style.jpg

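#6 Images → style transfer
Now put everything into a folder named crop,
and upload the whole folder to Google Drive.

I use Google Drive because its data is easy to mount in Colaboratory.

In Google Drive, choose "New" (under "More") and launch Colaboratory.
Then, from the menu bar at the top, select "Runtime" → "Change runtime type" and change the hardware accelerator from "None" to "GPU".

The code is basically taken from the page below, but I reordered it so that it can run inside a for loop.
https://pytorch.org/tutorials/advanced/neural_style_tutorial.html

First, mount Google Drive in Colaboratory.
Add the following at the very top of the notebook.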

from google.colab import drive
drive.mount('/content/drive')
%cd drive/"My Drive"/crop

Next, define all the classes and functions in one go.

from __future__ import print_function

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

from PIL import Image
import matplotlib.pyplot as plt

import torchvision.transforms as transforms
import torchvision.models as models

import copy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

import numpy as np


class ContentLoss(nn.Module):

    def __init__(self, target):
        super(ContentLoss, self).__init__()
        # we 'detach' the target content from the tree used
        # to dynamically compute the gradient: this is a stated value,
        # not a variable. Otherwise the forward method of the criterion
        # will throw an error.
        self.target = target.detach()

    def forward(self, input):
        self.loss = F.mse_loss(input, self.target)
        return input
      
class StyleLoss(nn.Module):

    def __init__(self, target_feature):
        super(StyleLoss, self).__init__()
        self.target = gram_matrix(target_feature).detach()

    def forward(self, input):
        G = gram_matrix(input)
        self.loss = F.mse_loss(G, self.target)
        return input
      
def gram_matrix(input):
    a, b, c, d = input.size()  # a=batch size(=1)
    # b=number of feature maps
    # (c,d)=dimensions of a f. map (N=c*d)

    features = input.view(a * b, c * d)  # resize F_XL into \hat F_XL

    G = torch.mm(features, features.t())  # compute the gram product

    # we 'normalize' the values of the gram matrix
    # by dividing by the number of element in each feature maps.
    return G.div(a * b * c * d)
  
  
class Normalization(nn.Module):
    def __init__(self, mean, std):
        super(Normalization, self).__init__()
        # .view the mean and std to make them [C x 1 x 1] so that they can
        # directly work with image Tensor of shape [B x C x H x W].
        # B is batch size. C is number of channels. H is height and W is width.
        self.mean = torch.tensor(mean).view(-1, 1, 1)
        self.std = torch.tensor(std).view(-1, 1, 1)

    def forward(self, img):
        # normalize img
        return (img - self.mean) / self.std  
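
To see what `gram_matrix` computes, here is a tiny worked example in plain Python (no PyTorch), assuming a batch of 1 with 2 feature maps of 3 values each. Each entry of G is the dot product of two flattened feature maps, divided by the total number of elements `a * b * c * d`, as in the function above. `gram_matrix_lists` is a hypothetical name used only for this illustration.

```python
def gram_matrix_lists(features, total_elements):
    """features: one flattened feature map per row (list of lists of floats)."""
    n = len(features)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # dot product of feature map i and feature map j
            dot = sum(x * y for x, y in zip(features[i], features[j]))
            gram[i][j] = dot / total_elements
    return gram


# a=1, b=2 feature maps, c*d=3 values each, so a*b*c*d = 6
features = [[1.0, 2.0, 3.0],
            [0.0, 1.0, 0.0]]
print(gram_matrix_lists(features, 6))
```

The result is symmetric, and its diagonal holds each feature map's dot product with itself. The style loss compares this matrix for the input image with the one for the style image.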

Next, use a for loop to run the optimization on every image in turn and save each result.

pic_start = 1
pic_finish = 125

for ite in np.arange(pic_start,pic_finish+1):
    # desired size of the output image
    imsize = 512 if torch.cuda.is_available() else 128  # use small size if no gpu

    loader = transforms.Compose([
        transforms.Resize(imsize),  # scale imported image
        transforms.ToTensor()])  # transform it into a torch tensor


    def image_loader(image_name):
        image = Image.open(image_name)
        # fake batch dimension required to fit network's input dimensions
        image = loader(image).unsqueeze(0)
        return image.to(device, torch.float)


    content_img = image_loader("image_{:0=3}.png".format(ite))
    style_img = image_loader("style.jpg")

    assert style_img.size() == content_img.size(), \
        "we need to import style and content images of the same size"

    unloader = transforms.ToPILImage()  # reconvert into PIL image

    plt.ion()

    def imshow(tensor, title=None):
        image = tensor.cpu().clone()  # we clone the tensor to not do changes on it
        image = image.squeeze(0)      # remove the fake batch dimension
        image = unloader(image)
        plt.imshow(image)
        if title is not None:
            plt.title(title)
        plt.pause(0.001) # pause a bit so that plots are updated


    plt.figure()
    imshow(style_img, title='Style Image')

    plt.figure()
    imshow(content_img, title='Content Image')

    cnn = models.vgg19(pretrained=True).features.to(device).eval()

    cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406]).to(device)
    cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225]).to(device)

    # desired depth layers to compute style/content losses :
    content_layers_default = ['conv_4']
    style_layers_default = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']

    def get_style_model_and_losses(cnn, normalization_mean, normalization_std,
                                   style_img, content_img,
                                   content_layers=content_layers_default,
                                   style_layers=style_layers_default):
        cnn = copy.deepcopy(cnn)

        # normalization module
        normalization = Normalization(normalization_mean, normalization_std).to(device)

        # just in order to have iterable access to the lists of content/style
        # losses
        content_losses = []
        style_losses = []

        # assuming that cnn is a nn.Sequential, so we make a new nn.Sequential
        # to put in modules that are supposed to be activated sequentially
        model = nn.Sequential(normalization)

        i = 0  # increment every time we see a conv
        for layer in cnn.children():
            if isinstance(layer, nn.Conv2d):
                i += 1
                name = 'conv_{}'.format(i)
            elif isinstance(layer, nn.ReLU):
                name = 'relu_{}'.format(i)
                # The in-place version doesn't play very nicely with the ContentLoss
                # and StyleLoss we insert below. So we replace with out-of-place
                # ones here.
                layer = nn.ReLU(inplace=False)
            elif isinstance(layer, nn.MaxPool2d):
                name = 'pool_{}'.format(i)
            elif isinstance(layer, nn.BatchNorm2d):
                name = 'bn_{}'.format(i)
            else:
                raise RuntimeError('Unrecognized layer: {}'.format(layer.__class__.__name__))

            model.add_module(name, layer)

            if name in content_layers:
                # add content loss:
                target = model(content_img).detach()
                content_loss = ContentLoss(target)
                model.add_module("content_loss_{}".format(i), content_loss)
                content_losses.append(content_loss)

            if name in style_layers:
                # add style loss:
                target_feature = model(style_img).detach()
                style_loss = StyleLoss(target_feature)
                model.add_module("style_loss_{}".format(i), style_loss)
                style_losses.append(style_loss)

        # now we trim off the layers after the last content and style losses
        for i in range(len(model) - 1, -1, -1):
            if isinstance(model[i], ContentLoss) or isinstance(model[i], StyleLoss):
                break

        model = model[:(i + 1)]

        return model, style_losses, content_losses

    input_img = content_img.clone()
    # if you want to use white noise instead uncomment the below line:
    # input_img = torch.randn(content_img.data.size(), device=device)

    # add the original input image to the figure:
    plt.figure()
    imshow(input_img, title='Input Image')

    def get_input_optimizer(input_img):
        # this line to show that input is a parameter that requires a gradient
        optimizer = optim.LBFGS([input_img.requires_grad_()])
        return optimizer

    def run_style_transfer(cnn, normalization_mean, normalization_std,
                           content_img, style_img, input_img, num_steps=300,
                           style_weight=1000000, content_weight=1):
        """Run the style transfer."""
        print('Building the style transfer model..')
        model, style_losses, content_losses = get_style_model_and_losses(cnn,
            normalization_mean, normalization_std, style_img, content_img)
        optimizer = get_input_optimizer(input_img)

        print('Optimizing..')
        run = [0]
        while run[0] <= num_steps:

            def closure():
                # correct the values of updated input image
                input_img.data.clamp_(0, 1)

                optimizer.zero_grad()
                model(input_img)
                style_score = 0
                content_score = 0

                for sl in style_losses:
                    style_score += sl.loss
                for cl in content_losses:
                    content_score += cl.loss

                style_score *= style_weight
                content_score *= content_weight

                loss = style_score + content_score
                loss.backward()

                run[0] += 1
                if run[0] % 50 == 0:
                    print("run {}:".format(run))
                    print('Style Loss : {:4f} Content Loss: {:4f}'.format(
                        style_score.item(), content_score.item()))
                    print()

                return style_score + content_score

            optimizer.step(closure)

        # a last correction...
        input_img.data.clamp_(0, 1)

        return input_img

    output = run_style_transfer(cnn, cnn_normalization_mean, cnn_normalization_std,
                                content_img, style_img, input_img)

    plt.figure()
    imshow(output, title='Output Image')

    # sphinx_gallery_thumbnail_number = 4
    plt.ioff()
    plt.show()


    # move the result to the CPU and drop the batch dimension: (1, 3, H, W) -> (3, H, W)
    out = output.detach().cpu().numpy().squeeze(0)

    print(out.shape)
    # scale [0, 1] floats to [0, 255], then reorder (C, H, W) -> (H, W, C) for PIL
    out = (out * 255).astype(np.uint8).transpose(1, 2, 0)

    im = Image.fromarray(out)
    im.save("image2_{:0=3}.png".format(ite))

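The code is long, so the explanation comes afterwards.

for ite in np.arange(pic_start, pic_finish + 1):
    ...
    content_img = image_loader("image_{:0=3}.png".format(ite))
    style_img = image_loader("style.jpg")

This is where the images are read in one at a time.

The last part passes the PyTorch output to PIL and saves the results under sequential names:
image2_001.png
image2_002.png
...
image2_125.png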

    # move the result to the CPU and drop the batch dimension: (1, 3, H, W) -> (3, H, W)
    out = output.detach().cpu().numpy().squeeze(0)

    print(out.shape)
    # scale [0, 1] floats to [0, 255], then reorder (C, H, W) -> (H, W, C) for PIL
    out = (out * 255).astype(np.uint8).transpose(1, 2, 0)

    im = Image.fromarray(out)
    im.save("image2_{:0=3}.png".format(ite))
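
The conversion does two things: it scales the float values from [0, 1] to [0, 255], and it reorders the axes from PyTorch's (channel, height, width) layout to the (height, width, channel) layout that PIL expects. Below is a plain-Python sketch of the same reordering on a tiny 3×1×2 image, using nested lists instead of NumPy purely for illustration. `chw_to_hwc_uint8` is a hypothetical name.

```python
def chw_to_hwc_uint8(chw):
    """Convert a [C][H][W] nested list of floats in [0, 1] to [H][W][C] ints in [0, 255]."""
    channels, height, width = len(chw), len(chw[0]), len(chw[0][0])
    return [[[int(chw[c][y][x] * 255) for c in range(channels)]
             for x in range(width)]
            for y in range(height)]


# 3 channels, 1 row, 2 pixels: the first pixel is pure red, the second is white
chw = [[[1.0, 1.0]], [[0.0, 1.0]], [[0.0, 1.0]]]
print(chw_to_hwc_uint8(chw))  # [[[255, 0, 0], [255, 255, 255]]]
```

Here `int()` truncates the fractional part, as the cast to `np.uint8` does for values in this range.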

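#7 Style-transferred numbered images → video

Download the numbered image2 images saved in Google Drive to your local machine (you can also do this step in Colab), then run the following command: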

ffmpeg -r 30 -i image2_%03d.png -vcodec libx264 -pix_fmt yuv420p -r 30 out.mp4

This produces out.mp4.
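
If you want to run this step from Python (for example inside Colab), the same command can be built as an argument list. This is a minimal sketch: `build_encode_command` is a hypothetical helper, and the flags are exactly those of the command above.

```python
def build_encode_command(pattern="image2_%03d.png", fps=30, output="out.mp4"):
    """Return the ffmpeg argument list for encoding numbered PNGs to H.264."""
    return ["ffmpeg", "-r", str(fps), "-i", pattern,
            "-vcodec", "libx264", "-pix_fmt", "yuv420p",
            "-r", str(fps), output]


cmd = build_encode_command()
print(" ".join(cmd))
```

Assuming `ffmpeg` is installed and on the PATH, passing the list to `subprocess.run(cmd, check=True)` would execute it. Keeping the arguments as a list avoids shell quoting problems with the `%03d` pattern.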
