
Real-time 3D Scene Layout from a Single Image Using Convolutional Neural Networks

Neural Networks

Posted at 2019-03-19

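Another SLAM-related post. This paper is about understanding the 3D layout of a scene from a monocular camera. It is useful when you want a robot to do in real time, while it is moving, what people do when they recognize objects as they walk. Humans have two eyes, one on the left and one on the right, but this method does the same kind of thing from a single monocular camera.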

Abstract

We consider the problem of understanding the 3D layout of indoor corridor scenes from a single image in real time. Identifying obstacles such as walls is essential for robot navigation, but also challenging due to the diversity in structure, appearance and illumination of real-world corridor scenes. Many current single-image methods make Manhattan-world assumptions, and break down in environments that do not meet this mold. They also may require complicated hand-designed features for image segmentation or clear boundaries to form certain building models. In addition, most cannot run in real time.
In this paper, we propose to combine machine learning with geometric modelling to build a simplified 3D model from a single image. We first employ a supervised Convolutional Neural Network (CNN) to provide a dense, but coarse, geometric class labelling of the scene. We then refine this labelling with a fully connected Conditional Random Field (CRF). Finally, we fit line segments along wall-ground boundaries and “pop up” a 3D model using geometric constraints.
We assemble a dataset of 967 labelled corridor images. Our experiments on this dataset and another publicly available dataset show our method outperforms other single image scene understanding methods in pixelwise accuracy while labelling images at over 15 Hz.
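
The abstract reports its comparison in terms of pixelwise accuracy. As a rough illustration only, assuming that metric means the fraction of pixels whose predicted class matches the ground-truth class, it could be computed as below. The function name, the class ids, and the toy label maps are made up for this example and are not taken from the paper.

```python
import numpy as np

def pixelwise_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    if pred.shape != truth.shape:
        raise ValueError("label maps must have the same shape")
    return float(np.mean(pred == truth))

# Hypothetical 3x4 label maps (class ids are assumptions): 0 = ground, 1 = wall, 2 = ceiling
truth = np.array([[2, 2, 2, 2],
                  [1, 1, 1, 1],
                  [0, 0, 0, 0]])
pred = truth.copy()
pred[1, 0] = 0  # mislabel one pixel out of twelve

print(pixelwise_accuracy(pred, truth))
```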

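In short, the paper addresses the problem of grasping the 3D layout of an indoor corridor scene in real time from a single monocular image. Identifying obstacles such as walls is essential for robot navigation, but it is hard because real corridors vary so much in structure, appearance, and lighting. Many existing single-image methods assume a Manhattan world and fail in environments that do not fit that mold. They may also depend on complicated hand-designed features for image segmentation, or on clear boundaries, in order to form particular building models. On top of that, most of them cannot run in real time.
The authors propose combining machine learning with geometric modelling to build a simplified 3D model from a single image. First, a CNN provides a dense but coarse geometric class labelling of the scene. Next, a fully connected CRF refines that labelling. Finally, line segments are fitted along the wall-ground boundaries and a 3D model is "popped up" using geometric constraints. The authors assemble a dataset of 967 labelled corridor images. Their experiments on this dataset and on another publicly available dataset show that the method outperforms other single-image scene understanding methods.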
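
To make the last step more concrete, here is a minimal sketch of fitting a wall-ground boundary and "popping up" 3D points from it. It is not the authors' implementation. It assumes a pinhole camera with a known focal length and principal point, a known camera height, and an optical axis parallel to the floor. It fits one least-squares line instead of the paper's line segments, and it uses a synthetic label map in place of the CNN + CRF output. All names, class ids, and numbers are hypothetical.

```python
import numpy as np

GROUND, WALL = 0, 1  # hypothetical class ids

def boundary_points(labels):
    """Return (u, v) pixels where a wall pixel sits directly above a ground pixel."""
    wall_above_ground = (labels[:-1, :] == WALL) & (labels[1:, :] == GROUND)
    rows, cols = np.nonzero(wall_above_ground)
    # rows + 1 is the first ground row under the wall in that column
    return np.column_stack([cols, rows + 1]).astype(float)

def backproject_to_ground(points, f, cx, cy, cam_height):
    """Intersect pixel rays with the floor plane y = cam_height.

    Camera frame: x right, y down, z forward. Pixels at or above the
    horizon row cy never hit the floor, so they are dropped.
    """
    u, v = points[:, 0], points[:, 1]
    below_horizon = v > cy
    u, v = u[below_horizon], v[below_horizon]
    z = cam_height * f / (v - cy)
    x = (u - cx) * z / f
    y = np.full_like(z, cam_height)
    return np.column_stack([x, y, z])

# Synthetic 80x100 label map: a left wall whose base follows the image line v = 90 - u
H, W = 80, 100
rows = np.arange(H)[:, None]
cols = np.arange(W)[None, :]
labels = np.where((cols < 50) & (rows < 90 - cols), WALL, GROUND)

pts = boundary_points(labels)

# Fit a single straight line v = slope * u + intercept to the boundary pixels
slope, intercept = np.polyfit(pts[:, 0], pts[:, 1], 1)

# Back-project the points on the fitted line onto the floor plane
line_pts = np.column_stack([pts[:, 0], slope * pts[:, 0] + intercept])
xyz = backproject_to_ground(line_pts, f=100.0, cx=50.0, cy=40.0, cam_height=1.0)

print(slope, intercept)
print(xyz[:3])
```

With these toy parameters, the boundary corresponds to a wall base running parallel to the optical axis, so the recovered x coordinates should all be close to the same value. Extruding those floor points vertically would give a simple wall plane.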

https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7487368

https://www.youtube.com/watch?v=2CvFHy5jk1c
