
Pyramid Pooling Explained

Posted at 2023-06-23, last updated at 2023-06-23

Original paper
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
https://arxiv.org/abs/1406.4729

Overview

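In image recognition, Convolutional Neural Networks (CNNs) have achieved high performance on a wide range of tasks.
A CNN is built from two kinds of operations: convolution, which slides a weight kernel over small regions of the feature map, and pooling, which aggregates information from neighboring positions. This article explains Pyramid Pooling, which changes the structure of the pooling step.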

Structure

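The input feature map is divided into a specified number of regions, and pooling is applied within each region. For example, with the 16-way split shown in the figure on the right, the values in each region are reduced to a single value per channel, so the output has shape $4 \times 4 \times d$, where $d$ is the number of channels.
The pooled outputs obtained with the different numbers of divisions are each flattened to one dimension and then concatenated.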
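The divide, pool, flatten, and concatenate steps above can be sketched in plain Python. This is a minimal illustration and not the paper's implementation: the function names `pool_level` and `spatial_pyramid_pool`, the use of max pooling, the pyramid levels `(4, 2, 1)`, and the floor/ceil rule for region boundaries are all assumptions made for this example. The feature map is represented as a nested list of shape H x W x d.

```python
def pool_level(feature, n):
    """Max-pool an H x W x d feature map (nested lists) over an n x n grid.

    Returns the pooled values already flattened to a list of length n * n * d.
    Assumes n <= H and n <= W so that every region is non-empty.
    """
    h, w, d = len(feature), len(feature[0]), len(feature[0][0])
    out = []
    for i in range(n):
        # Rows covered by grid row i: floor for the start, ceil for the end.
        r0 = (i * h) // n
        r1 = ((i + 1) * h + n - 1) // n
        for j in range(n):
            # Columns covered by grid column j, using the same rule.
            c0 = (j * w) // n
            c1 = ((j + 1) * w + n - 1) // n
            for ch in range(d):
                # Each region collapses to one value per channel.
                out.append(max(feature[r][c][ch]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out


def spatial_pyramid_pool(feature, levels=(4, 2, 1)):
    """Pool at every pyramid level and concatenate the flattened results."""
    pooled = []
    for n in levels:
        pooled.extend(pool_level(feature, n))
    return pooled


if __name__ == "__main__":
    # Two feature maps with different spatial sizes but the same depth d = 2.
    small = [[[r + c, r - c] for c in range(8)] for r in range(8)]
    large = [[[r * c, r + c] for c in range(10)] for r in range(13)]
    print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

With these assumed levels, the concatenated vector has $(16 + 4 + 1) \times d$ entries, so its length depends only on $d$ and the chosen levels, not on the height or width of the input feature map.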
