[Review] What Does "Convolution" Mean Mathematically?

Posted at 2018-05-02 (last updated at 2018-05-02)

Preface

In this article, I would like to talk about the meaning of the convolution operation in CNNs. I assume that you already have a basic understanding of CNNs.

References

What is convolution? (in Japanese): http://www.ice.tohtech.ac.jp/~nakagawa/laplacetrans/convolution1.htm
What is image processing by convolution? (in Japanese): https://www.clg.niigata-u.ac.jp/~medimg/practice_medical_imaging/imgproc_scion/4filter/index.htm
Cross-Correlation: http://mathworld.wolfram.com/Cross-Correlation.html
Cross-Correlation vs Convolution: https://www.youtube.com/watch?v=C3EEy8adxvc

Signal and Image

Generally speaking, an image can be treated as a signal. As you can see below, the luminance of an image can be plotted as a surface in 3D, where the height is the brightness at each pixel position.
Likewise, a signal can be plotted in 2D or 3D.

[Reference](https://www.jstage.jst.go.jp/article/itej/67/1/67_36/_pdf)

Also, an image is composed of integer values, one (or one per channel) for each pixel.

Reference

Hence, we can borrow some arithmetic operations from signal processing and use them in image processing.
One of the most obvious is convolution, which I aim to describe in this article.
Before facing the actual math, however, I would like to confirm that convolution means much the same thing in signal processing and in image processing.

Convolution in signal processing (Conceptual Explanation)

In signal processing, we use the continuous version of convolution, as below.

reference
With this approach, we blend two functions, $f(t)$ and $g(x-t)$, by integrating their product over $t$.
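
For reference, this is the standard textbook definition of continuous convolution, written with the same symbols as above. The infinite integration limits are the usual convention and are an assumption here, not taken from the original figure.

$$
(f * g)(x) = \int_{-\infty}^{\infty} f(t)\, g(x - t)\, dt
$$

The argument $x - t$ means that $g$ is mirrored and then shifted by $x$ before being multiplied with $f$.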

reference

A good reference in Japanese: http://www.yukisako.xyz/entry/tatamikomi

Convolution in image processing

In image processing, as we can imagine, an image is a discrete and finite domain.
That is, it is defined only at pixel positions, and it has edges (borders). Hence, in image processing we normally use the discrete version of convolution, as below.
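
For reference, the discrete definition can be written as follows. The symbols $I$ (image), $K$ (filter) and the indices $i, j, m, n$ are notation assumed here for illustration. The sums run over all positions where both terms are defined.

$$
(f * g)[n] = \sum_{m} f[m]\, g[n - m]
$$

$$
(I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m,\; j - n)
$$

The first line is the 1D case and the second is the 2D case used for images. The integral of the continuous version is replaced by a sum.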

[reference](http://mathtrain.jp/tatami)

With this approach, we can blend the image (feature map) and the weight matrix (filter). Without padding, the output of this operation is a feature map that is slightly smaller than the input, because the filter cannot be centred on the pixels at the border. As you know, the filter is applied across each colour channel, such as red, green and blue; in a standard convolutional layer, the per-channel results are then summed into one output feature map per filter. Note that weight sharing in a CNN means that the same filter weights are reused at every spatial position of the input.
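
The trimming effect can be checked with a minimal sketch in plain Python. The function name `conv2d_valid`, the 4x4 `image` and the 2x2 `kernel` are hypothetical examples, and the sketch assumes a single channel, stride 1 and no padding.

```python
def conv2d_valid(image, kernel):
    """2D convolution of a single-channel image, no padding, stride 1."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Flip the kernel vertically and horizontally: this flip is what
    # makes the operation a convolution rather than a cross-correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    # Without padding, the output shrinks by (kernel size - 1) per axis.
    oh, ow = ih - kh + 1, iw - kw + 1
    out = []
    for i in range(oh):
        row = []
        for j in range(ow):
            acc = 0
            for m in range(kh):
                for n in range(kw):
                    acc += image[i + m][j + n] * flipped[m][n]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x4 single-channel image and 2x2 filter.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_valid(image, kernel)
print(len(feature_map), "x", len(feature_map[0]))  # 4x4 input, 2x2 filter -> 3x3 output
print(feature_map)
```

A 4x4 input with a 2x2 filter yields a 3x3 feature map, which is the "trimmed" output described above.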

So far we have seen the relationship between image processing and signal processing. But if we investigate convolution further, we will encounter cross-correlation, which is an operation similar to convolution.

Cross-Correlation

Convolution
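
For reference, the two definitions can be written side by side in their standard continuous form. This assumes real-valued $f$ and $g$ (for complex signals, cross-correlation also takes the complex conjugate of $f$); the symbol $\star$ for cross-correlation is a common convention assumed here.

$$
(f \star g)(x) = \int_{-\infty}^{\infty} f(t)\, g(x + t)\, dt
\qquad \text{(cross-correlation)}
$$

$$
(f * g)(x) = \int_{-\infty}^{\infty} f(t)\, g(x - t)\, dt
\qquad \text{(convolution)}
$$

The only difference is the sign of $t$ in the argument of $g$.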

The clearest video I have found so far: https://www.youtube.com/watch?v=MQm6ZP1F6ms

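So what they do is fundamentally analogous, but the order in which the filter is traversed is opposite: convolution flips the filter before sliding it over the input, whereas cross-correlation slides it as it is.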
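
The relationship can be illustrated with a small 1D sketch in plain Python. The function names and the sample `signal` and kernels are hypothetical; the sketch assumes real-valued inputs and no padding.

```python
def cross_correlate_valid(signal, kernel):
    """Slide the kernel over the signal as-is and sum the products."""
    k = len(kernel)
    return [
        sum(signal[i + m] * kernel[m] for m in range(k))
        for i in range(len(signal) - k + 1)
    ]


def convolve_valid(signal, kernel):
    """Convolution is cross-correlation with the kernel reversed."""
    return cross_correlate_valid(signal, kernel[::-1])


signal = [1, 2, 3, 4, 5]

# An asymmetric kernel: the two operations give different results.
edge_kernel = [1, 0, -1]
corr = cross_correlate_valid(signal, edge_kernel)
conv = convolve_valid(signal, edge_kernel)
print(corr, conv)

# A symmetric kernel: flipping changes nothing, so the results match.
smooth_kernel = [1, 2, 1]
corr_sym = cross_correlate_valid(signal, smooth_kernel)
conv_sym = convolve_valid(signal, smooth_kernel)
print(corr_sym, conv_sym)
```

With the asymmetric kernel the two results differ in sign, while with the symmetric kernel they are identical, because reversing a symmetric kernel leaves it unchanged.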

Conclusion

In a convolution, each local area of pixels in an image is multiplied element-wise by a weight matrix, and the products are summed. With an averaging filter, for example, this gives us a smoothed version of the image.

Thank you.
