[3D OD, LiDAR] 3DSSD: proposing object candidates with point sampling & grouping

Posted at 2020-12-30

3DSSD: Point-based 3D Single Stage Object Detector

![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/482094/3310868c-69a5-afcb-94e4-1f8a07e91ac5.png)

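The paper argues that in prior work, the Feature Propagation (FP) layers (the upsampling layers of point-feature networks such as PointNet++) and the second-stage refinement module together take more than half of the total computation time. Its aim is therefore to remove both modules and propose a fast single-stage network.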

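Algorithm

Backbone

1. Sample a subset of the points.
2. Group each point that was not sampled with the nearest sampled point.
3. Learn the features of the grouped points with an MLP.
4. Apply max pooling to extract one feature vector for each group.
5. Repeat steps 1-4 several times.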
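The backbone steps above can be sketched in NumPy as follows. This is only a minimal illustration, not the paper's implementation: the function names, the single-layer "MLP" with hypothetical `weight` and `bias` parameters, and the use of plain distance-based farthest point sampling for step 1 are all assumptions made for the example.

```python
import numpy as np

def farthest_point_sampling(points, num_samples):
    """Greedy farthest point sampling; returns row indices into `points`."""
    selected = [0]  # start from the first point (arbitrary choice)
    # Distance from every point to its closest already-selected point.
    min_dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(num_samples - 1):
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        dist_to_new = np.linalg.norm(points - points[idx], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(selected)

def set_abstraction(xyz, feats, num_samples, weight, bias):
    """One pass of steps 1-4: sample, group, MLP, max pool."""
    # Step 1: sample representative points.
    centers = farthest_point_sampling(xyz, num_samples)
    # Step 2: assign every point to its nearest sampled point.
    dist = np.linalg.norm(xyz[:, None, :] - xyz[centers][None, :, :], axis=2)
    group_of = np.argmin(dist, axis=1)  # (N,) group id per point
    # Step 3: shared per-point "MLP" (one linear layer + ReLU, an assumption).
    hidden = np.maximum(feats @ weight + bias, 0.0)
    # Step 4: max pool over the points of each group.
    pooled = np.stack(
        [hidden[group_of == g].max(axis=0) for g in range(num_samples)]
    )
    return xyz[centers], pooled

rng = np.random.default_rng(0)
xyz = rng.normal(size=(64, 3))    # 64 points
feats = rng.normal(size=(64, 4))  # 4 input channels per point
weight, bias = rng.normal(size=(4, 8)), np.zeros(8)
new_xyz, new_feats = set_abstraction(xyz, feats, 16, weight, bias)
print(new_xyz.shape, new_feats.shape)
```

Step 5 would correspond to calling `set_abstraction` again on `new_xyz` and `new_feats` with a smaller `num_samples`.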

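Point selection combining F-FPS (Feature Farthest Point Sampling) and D-FPS (Distance Farthest Point Sampling)

![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/482094/488ae53f-7e5f-adc7-d1ab-a8bb509baa9b.png)

F-FPS samples points by their distance in semantic (feature) space. On a car, for example, the windows and the wheels are close together in 3D space but far apart semantically, so F-FPS picks a diverse set of points. However, the features of different objects of the same class (e.g. two people) end up close to each other, so with F-FPS alone only one of them may be sampled. Combining it with D-FPS, which uses Euclidean distance, lets the sampler treat them as separate objects.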
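The toy example below illustrates this behavior. It is a sketch under simplifying assumptions: `fps` and `fusion_sampling` are hypothetical names, F-FPS here uses the feature distance alone, and the two "people" are four points each with identical features, placed 10 units apart.

```python
import numpy as np

def fps(vectors, num_samples):
    """Farthest point sampling in whatever space `vectors` lives in."""
    selected = [0]  # start from the first row (arbitrary choice)
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    for _ in range(num_samples - 1):
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        dist_to_new = np.linalg.norm(vectors - vectors[idx], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(selected)

def fusion_sampling(xyz, feats, num_samples):
    """Take half of the samples with F-FPS and half with D-FPS."""
    half = num_samples // 2
    f_idx = fps(feats, half)  # F-FPS: distance between feature vectors
    d_idx = fps(xyz, half)    # D-FPS: Euclidean distance between coordinates
    return f_idx, d_idx

# Toy scene: two "people" with the same shape and the same features.
person = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
xyz = np.concatenate([person, person + np.array([10.0, 0.0, 0.0])])
part_feats = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
feats = np.tile(part_feats, (2, 1))  # rows 0-3: person A, rows 4-7: person B

f_idx, d_idx = fusion_sampling(xyz, feats, num_samples=4)
print("F-FPS indices:", f_idx)
print("D-FPS indices:", d_idx)
```

On this input the F-FPS samples all come from person A (indices below 4), because person B's features duplicate them, while D-FPS also selects a point on person B.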

Candidate Generation Layer

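1. Estimate the shift (x, y, z) toward the object center from the features of the points sampled by F-FPS.
2. Shift the XYZ coordinates of the F-FPS points by that estimate.
3. Group the points sampled by D-FPS and by F-FPS around the points shifted in step 2, according to proximity. (This yields Nm/2 groups.)
4. Learn the features of each group with an MLP.
5. Apply max pooling to extract one feature vector for each group. (Each of the Nm/2 groups has Cm features.)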
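A minimal NumPy sketch of the candidate generation steps follows. Several details are assumptions made only for illustration: the shift is predicted by a single linear layer (`w_shift`), grouping takes the `k` nearest points to each shifted candidate rather than whatever neighborhood rule the paper uses, and the MLP is one linear layer (`w_mlp`) with ReLU.

```python
import numpy as np

def candidate_generation(f_xyz, f_feats, d_xyz, d_feats, w_shift, w_mlp, k=4):
    """Shift F-FPS points toward object centers, then group and pool."""
    # Step 1: predict an (x, y, z) shift from each F-FPS point's features.
    shift = f_feats @ w_shift                      # (Nm/2, 3)
    # Step 2: move the F-FPS points by the predicted shift.
    candidates = f_xyz + shift                     # (Nm/2, 3)
    # Step 3: group F-FPS and D-FPS points around each candidate.
    all_xyz = np.concatenate([f_xyz, d_xyz])       # (Nm, 3)
    all_feats = np.concatenate([f_feats, d_feats]) # (Nm, C)
    dist = np.linalg.norm(
        candidates[:, None, :] - all_xyz[None, :, :], axis=2
    )                                              # (Nm/2, Nm)
    nn_idx = np.argsort(dist, axis=1)[:, :k]       # k nearest per candidate
    grouped = all_feats[nn_idx]                    # (Nm/2, k, C)
    # Step 4: shared MLP on every grouped point (linear + ReLU).
    hidden = np.maximum(grouped @ w_mlp, 0.0)      # (Nm/2, k, Cm)
    # Step 5: max pool over each group.
    return candidates, hidden.max(axis=1)          # (Nm/2, 3), (Nm/2, Cm)

rng = np.random.default_rng(0)
half, channels, cm = 8, 16, 32                     # Nm/2, C, Cm
f_xyz, d_xyz = rng.normal(size=(half, 3)), rng.normal(size=(half, 3))
f_feats = rng.normal(size=(half, channels))
d_feats = rng.normal(size=(half, channels))
w_shift = rng.normal(size=(channels, 3)) * 0.1
w_mlp = rng.normal(size=(channels, cm))
candidates, cand_feats = candidate_generation(
    f_xyz, f_feats, d_xyz, d_feats, w_shift, w_mlp
)
print(candidates.shape, cand_feats.shape)
```

The output has one row per F-FPS point, which matches the Nm/2 groups with Cm features each described in the list.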

Anchor-free Regression Head

Anchor-based methods have to define many anchors, which increases the amount of computation; that is apparently why the paper goes anchor-free.

Prediction Head

1. Estimate the bounding box and the class directly from each group (Nm/2 objects are output).
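To make "directly" concrete, here is a hedged sketch of an anchor-free head that regresses each box relative to its candidate point instead of relative to predefined anchors. The 7-value box encoding (center offset, log-size, yaw), the linear layers `w_box` and `w_cls`, and the function name are assumptions for illustration and not the paper's exact parameterization.

```python
import numpy as np

def predict_boxes(candidates, cand_feats, w_box, w_cls):
    """Decode one box and one class label per candidate point."""
    reg = cand_feats @ w_box               # (M, 7) raw regression outputs
    centers = candidates + reg[:, 0:3]     # offset from the candidate point
    sizes = np.exp(reg[:, 3:6])            # exp keeps the sizes positive
    yaw = reg[:, 6:7]                      # heading angle, regressed directly
    boxes = np.concatenate([centers, sizes, yaw], axis=1)  # (M, 7)
    logits = cand_feats @ w_cls            # (M, num_classes)
    labels = logits.argmax(axis=1)         # most likely class per candidate
    return boxes, labels

rng = np.random.default_rng(0)
num_candidates, cm, num_classes = 8, 32, 3
candidates = rng.normal(size=(num_candidates, 3))
cand_feats = rng.normal(size=(num_candidates, cm))
w_box = rng.normal(size=(cm, 7)) * 0.05
w_cls = rng.normal(size=(cm, num_classes))
boxes, labels = predict_boxes(candidates, cand_feats, w_box, w_cls)
print(boxes.shape, labels.shape)
```

No anchor set appears anywhere in the function: the number of outputs equals the number of candidate points.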

Results

![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/482094/6f8f583c-8e48-14a9-23fc-e380dd941558.png)

In the reported results, 3DSSD has the highest accuracy among the single-stage detectors!

It is also fast among point-based methods.

Conclusion

・The idea of sampling & grouping points to propose bounding box candidates is interesting.

References

3DSSD: Point-based 3D Single Stage Object Detector https://arxiv.org/pdf/2002.10187.pdf
