Journal
INFORMATION SCIENCES
Volume 476, Issue -, Pages 147-158
Publisher
ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2018.09.040
Keywords
3D object detection; RGB-D data; Deep neural networks; Multi-modal region proposal networks; Deep feature learning
Funding
- National Key R&D Program of China [2017YFB1002203]
- National Natural Science Foundation of China [61671426, 61731022, 61471150, 61572077]
- Instrument Developing Project of the Chinese Academy of Sciences [YZ201670]
- Beijing Natural Science Foundation [4182071]
- CAS-TWAS President's Fellowship [2015CTF075]
3D object detection in RGB-D images is a fast-growing research area in computer vision. In this paper, we study the problem of amodal 3D object detection in RGB-D images and present an efficient 3D object detection system that can predict object location, size, and orientation. Unlike existing methods that either use multistage point cloud processing or rely on pre-computed segmentation masks to generate the 3D bounding boxes, we leverage only 2D region proposals for this task. Given a color image and its paired depth image as input, we first predict 2D region proposals with the designed multimodal fusion region proposal networks. We then propose an efficient method to generate 3D bounding boxes from those region proposals by scaling down the 2D bounding boxes with a scale factor and projecting them into 3D space. We evaluate our system on the challenging NYUv2 and SUN RGB-D datasets and compare it with state-of-the-art detection methods. The experimental results show that our method outperforms the state of the art by a remarkable margin with faster detection time. We achieve the best results on the NYUv2 dataset on a 19-class object detection task, and comparable results with faster detection on the SUN RGB-D dataset on a 10-class object detection task. (C) 2018 Published by Elsevier Inc.
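The abstract's step of scaling down a 2D box and projecting it into 3D space can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details, so the pinhole camera model, the intrinsics `fx`, `fy`, `cx`, `cy`, the default `scale=0.8`, the axis-aligned extent, and the function names `shrink_box` and `box_to_3d` are all assumptions made for illustration. Orientation estimation is left out entirely.

```python
import numpy as np

def shrink_box(box, scale):
    """Shrink a 2D box (x1, y1, x2, y2) about its center by `scale` (hypothetical helper)."""
    x1, y1, x2, y2 = box
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = scale * (x2 - x1) / 2.0
    half_h = scale * (y2 - y1) / 2.0
    return (mx - half_w, my - half_h, mx + half_w, my + half_h)

def box_to_3d(box, depth, fx, fy, cx, cy, scale=0.8):
    """Back-project the depth pixels inside a shrunken 2D box and return
    (center, size) of their axis-aligned 3D extent, or None if no valid depth.

    Assumption: pinhole model, depth in metric units, 0 marks missing depth.
    """
    x1, y1, x2, y2 = shrink_box(box, scale)
    h, w = depth.shape
    # Clip the shrunken box to the image and convert to integer pixel ranges.
    u0, u1 = max(int(np.floor(x1)), 0), min(int(np.ceil(x2)), w)
    v0, v1 = max(int(np.floor(y1)), 0), min(int(np.ceil(y2)), h)
    if u0 >= u1 or v0 >= v1:
        return None
    us, vs = np.meshgrid(np.arange(u0, u1), np.arange(v0, v1))
    z = depth[v0:v1, u0:u1]
    valid = z > 0
    if not valid.any():
        return None
    z = z[valid]
    # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (us[valid] - cx) * z / fx
    y = (vs[valid] - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    return (lo + hi) / 2.0, hi - lo

# Toy usage: a flat depth map 2 units away and a made-up proposal box.
depth = np.full((10, 10), 2.0)
center, size = box_to_3d((2, 2, 8, 8), depth, fx=1.0, fy=1.0, cx=5.0, cy=5.0, scale=0.5)
```

Shrinking the box before back-projection is one plausible way to limit background pixels near the box border from inflating the 3D extent; whether the paper uses the scale factor for this reason is not stated in the abstract.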