4.6 Article

High-speed object detection with a single-photon time-of-flight image sensor

Journal

OPTICS EXPRESS
Volume 29, Issue 21, Pages 33184-33196

Publisher

Optica Publishing Group
DOI: 10.1364/OE.435619

Funding

  1. Defence Science and Technology Laboratory [DSTLX1000147844]
  2. Royal Academy of Engineering [RF/201718/17128]
  3. Engineering and Physical Sciences Research Council [EP/M01326X/1, EP/S001638/1]
  4. EPSRC [EP/M01326X/1, EP/S001638/1] Funding Source: UKRI


3D time-of-flight (ToF) imaging is widely used in various fields such as augmented reality (AR), computer interfaces, robotics, and autonomous systems. Single-photon avalanche diodes (SPADs) play a critical role in providing accurate depth data over long distances. By employing convolutional neural networks (CNNs) for high-performance object detection, the limitations of small array sizes and limited lateral resolution can be overcome, enabling the extraction of more information from the image. Outdoor results from a portable SPAD camera system demonstrate the advantages of providing the CNN with full histogram data for object detection, with GPU-accelerated processing leading to fast overall latency suitable for safety-critical computer vision applications.
3D time-of-flight (ToF) imaging is used in a variety of applications such as augmented reality (AR), computer interfaces, robotics and autonomous systems. Single-photon avalanche diodes (SPADs) are one of the enabling technologies providing accurate depth data even over long ranges. By developing SPADs in array format with integrated processing, combined with pulsed, flood-type illumination, high-speed 3D capture is possible. However, array sizes tend to be relatively small, limiting the lateral resolution of the resulting depth maps and, consequently, the information that can be extracted from the image for applications such as object detection. In this paper, we demonstrate that these limitations can be overcome through the use of convolutional neural networks (CNNs) for high-performance object detection. We present outdoor results from a portable SPAD camera system that outputs 16-bin photon timing histograms with 64×32 spatial resolution, with each histogram containing thousands of photons. The results, obtained with exposure times down to 2 ms (equivalent to 500 FPS) and at signal-to-background ratios (SBR) as low as 0.05, point to the advantages of providing the CNN with full histogram data rather than point clouds alone. Alternatively, a combination of point cloud and active intensity data may be used as input, for a similar level of performance. In either case, the GPU-accelerated processing time is less than 1 ms per frame, leading to an overall latency (image acquisition plus processing) in the millisecond range, making the results relevant for safety-critical computer vision applications which would benefit from faster-than-human reaction times. Published by The Optical Society under the terms of the Creative Commons Attribution 4.0 License.
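
To make the quantities in the abstract concrete, the following is a minimal sketch (not the authors' processing pipeline) of how one pixel's timing histogram might be reduced to the point-cloud depth and active-intensity values mentioned above, and of how a 2 ms exposure maps to 500 FPS. The function names, the 1 ns bin width, the median-based background estimate, and the SBR definition used here (signal counts in the peak bin divided by total background counts across all bins) are all assumptions made for illustration; the abstract does not specify them.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def frame_rate_hz(exposure_s):
    """Frame rate implied by back-to-back exposures of the given length."""
    return 1.0 / exposure_s


def summarize_histogram(hist, bin_width_s):
    """Reduce one pixel's photon timing histogram to (depth_m, intensity, sbr).

    Assumptions (hypothetical, for illustration only):
      - the return pulse falls in the single bin with the most counts;
      - the per-bin background is the median of the remaining bins;
      - SBR = signal counts / total background counts over all bins.
    """
    n = len(hist)
    peak = max(range(n), key=lambda i: hist[i])

    # Median of the non-peak bins as a simple per-bin background estimate.
    others = sorted(h for i, h in enumerate(hist) if i != peak)
    background_per_bin = others[len(others) // 2]

    # Active intensity: counts in the peak bin above the background level.
    intensity = max(hist[peak] - background_per_bin, 0)

    # Round-trip time at the centre of the peak bin, converted to range.
    tof_s = (peak + 0.5) * bin_width_s
    depth_m = C * tof_s / 2.0

    total_background = background_per_bin * n
    sbr = intensity / total_background if total_background > 0 else math.inf
    return depth_m, intensity, sbr


if __name__ == "__main__":
    # A 16-bin histogram with a weak return in bin 5 on a flat background.
    hist = [100] * 5 + [180] + [100] * 10
    depth, intensity, sbr = summarize_histogram(hist, bin_width_s=1e-9)
    print(f"{frame_rate_hz(2e-3):.0f} FPS, depth {depth:.3f} m, "
          f"intensity {intensity}, SBR {sbr:.2f}")
```

Under these assumptions, the example histogram has an SBR of 0.05, the lowest level quoted in the abstract. Reducing each pixel to a depth and an intensity value in this way discards the shape of the remaining bins, which is the information a CNN fed with full histograms can still exploit.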
