
[Semantic Segmentation] DeepLab(v3+): What changed from DeepLab(v3)?

Last updated at 2024-05-19, posted at 2020-11-28

Background

I have already gone through [DeepLab(v1)](https://qiita.com/minh33/items/8eb31d16a975d2a87de5), [DeepLab(v2)](https://qiita.com/minh33/items/23030207de240edecc15), and [DeepLab(v3)](https://qiita.com/minh33/items/deb7765ca064f1b0477b), so to finish I want to summarize DeepLab(v3+) and compare what was updated. Reading all of them was hard work (T_T)

Updates from version 3

![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/482094/52e65f12-b4ce-6d9b-35f1-e2e91cc2383e.png)

DeepLabv3 used only the encoder shown in (a): the semantic map at 1/8 of the input size was interpolated by a factor of 8 and used as the final output.

However, that limits the result to the accuracy of a 1/8-resolution map, so the authors apparently decided to bring in part of a decoder.
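To make the difference between the two output paths concrete, here is a minimal shape-tracing sketch. It does no real computation; it only follows the feature-map sizes. The v3 path uses the 1/8 map described above. The v3+ numbers (a 1/16 encoder output, low-level features at 1/4 resolution, and two 4x upsampling steps) are assumptions taken from the configuration in the paper listed in the references, not from this article, and the function names are hypothetical.

```python
def v3_output_path(input_hw, output_stride=8):
    """DeepLabv3-style path: encoder output is upsampled in one step."""
    h, w = input_hw
    enc = (h // output_stride, w // output_stride)
    out = (enc[0] * output_stride, enc[1] * output_stride)
    return [("encoder", enc), (f"upsample x{output_stride}", out)]


def v3plus_output_path(input_hw, output_stride=16, low_level_stride=4):
    """DeepLabv3+-style path (assumed strides): upsample, merge, upsample."""
    h, w = input_hw
    enc = (h // output_stride, w // output_stride)
    low = (h // low_level_stride, w // low_level_stride)
    # First upsampling brings the encoder output to the low-level resolution.
    first_up = output_stride // low_level_stride
    up = (enc[0] * first_up, enc[1] * first_up)
    if up != low:
        raise ValueError("input size must be divisible by output_stride")
    # After merging with low-level features, upsample back to the input size.
    out = (up[0] * low_level_stride, up[1] * low_level_stride)
    return [
        ("encoder", enc),
        (f"upsample x{first_up}", up),
        ("concat with low-level features", low),
        (f"upsample x{low_level_stride}", out),
    ]


if __name__ == "__main__":
    for name, size in v3_output_path((512, 512)):
        print("v3 ", name, size)
    for name, size in v3plus_output_path((512, 512)):
        print("v3+", name, size)
```

The point of the second path is that the final prediction is refined at 1/4 resolution using features that still carry fine spatial detail, instead of being interpolated directly from the coarse map.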

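The depthwise and pointwise convolutions that appear in MobileNet split one convolution into two passes, one over the spatial dimensions and one over the channel dimension, which needs less computation than a standard convolution.
For details, see this article: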
https://qiita.com/omiita/items/77dadd5a7b16a104df83
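The saving can be checked with simple arithmetic. The sketch below counts multiply-accumulate operations for a standard convolution and for a depthwise separable one; the function names and the example sizes (a 64x64 feature map with 256 input and output channels) are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_cost(h, w, c_in, c_out, k):
    """Multiply-accumulates for a standard k x k convolution (stride 1)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_cost(h, w, c_in, c_out, k):
    """Multiply-accumulates for depthwise (k x k) plus pointwise (1 x 1)."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    h = w = 64
    c_in = c_out = 256
    k = 3
    std = standard_conv_cost(h, w, c_in, c_out, k)
    sep = depthwise_separable_cost(h, w, c_in, c_out, k)
    # The ratio simplifies to 1 / c_out + 1 / k**2.
    print(f"standard:  {std}")
    print(f"separable: {sep}")
    print(f"ratio:     {sep / std:.3f}")
```

For these sizes the ratio is 1/256 + 1/9, about 0.115, so the separable version needs roughly 8.7 times fewer multiply-accumulates.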

The same idea was applied to dilated (atrous) convolution to lower its computational cost.
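As a rough illustration of what an atrous separable convolution does, the following pure-Python sketch applies a depthwise convolution whose taps are spaced `rate` pixels apart, then a pointwise (1x1) convolution that mixes channels. It is a slow reference version on nested lists with zero padding, written only for illustration; all function names are hypothetical and this is not the authors' implementation.

```python
def atrous_depthwise_conv(x, kernels, rate):
    """x: [C][H][W], kernels: [C][k][k]. Zero padding keeps the H x W size."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    half = k // 2
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for c in range(channels):
        for i in range(height):
            for j in range(width):
                acc = 0.0
                for u in range(k):
                    for v in range(k):
                        # Kernel taps are spread out by `rate` (the "holes").
                        ii = i + (u - half) * rate
                        jj = j + (v - half) * rate
                        if 0 <= ii < height and 0 <= jj < width:
                            acc += kernels[c][u][v] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


def pointwise_conv(x, weights):
    """x: [C_in][H][W], weights: [C_out][C_in]. A 1 x 1 convolution."""
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [sum(weights[o][c] * x[c][i][j] for c in range(c_in))
             for j in range(width)]
            for i in range(height)
        ]
        for o in range(len(weights))
    ]


def atrous_separable_conv(x, dw_kernels, pw_weights, rate):
    """Depthwise atrous convolution followed by a pointwise convolution."""
    return pointwise_conv(atrous_depthwise_conv(x, dw_kernels, rate), pw_weights)
```

With a 3x3 kernel and `rate=2`, each output pixel sees a 5x5 window while still using only 9 weights per channel, which is how the receptive field grows without adding parameters.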

One more difference is that the ResNet backbone was replaced with Xception.

Conclusion

The new updates are the following three:
・A decoder was added
・Depthwise separable convolution was applied to dilated (atrous) convolution
・ResNet -> Xception

Having followed DeepLab from V1 through V3+, I felt that the main theme of this line of research was how to capture global features efficiently using dilated (atrous) convolution.

References

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation https://arxiv.org/pdf/1802.02611.pdf
