4.7 Article

Real-Time 3D Single Object Tracking With Transformer

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 25, Pages 2339-2353

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2022.3146714

Keywords

3D single object tracking; lidar point-cloud; siamese network; transformer; self attention

LiDAR-based 3D single object tracking is challenging because point clouds of distant objects are sparse and often occluded. To address this, we propose the Point-Track-Transformer (PTT) module, which uses the Transformer architecture to compute attention weights and generate fine-tuned attention features. With PTT embedded into the tracking pipeline, our PTT-Net achieves state-of-the-art performance on the KITTI and NuScenes datasets, surpassing the baseline by around 10% in the Car category.
LiDAR-based 3D single object tracking is a challenging problem in robotics and autonomous driving. Existing approaches often struggle because objects at long distances tend to have very sparse or partially occluded point clouds, which makes the features extracted by the model ambiguous. Ambiguous features make it hard to locate the target object and ultimately lead to poor tracking results. To solve this problem, we utilize the powerful Transformer architecture and propose a Point-Track-Transformer (PTT) module for the point cloud-based 3D single object tracking task. Specifically, the PTT module generates fine-tuned attention features by computing attention weights, which guide the tracker to focus on the important features of the target and improve tracking ability in complex scenarios. To evaluate our PTT module, we embed PTT into the dominant method and construct a novel 3D SOT tracker named PTT-Net. In PTT-Net, we embed PTT into the voting stage and the proposal generation stage, respectively. The PTT module in the voting stage models the interactions among point patches and thereby learns context-dependent features. Meanwhile, the PTT module in the proposal generation stage captures the contextual information between object and background. We evaluate PTT-Net on the KITTI and NuScenes datasets. Experimental results demonstrate the effectiveness of the PTT module and the superiority of PTT-Net, which surpasses the baseline by a noticeable margin of around 10% in the Car category. Our method also shows a significant performance improvement in sparse scenarios. Overall, the combination of the Transformer and the tracking pipeline enables PTT-Net to achieve state-of-the-art performance on both datasets. Additionally, PTT-Net runs in real time at 40 FPS on an NVIDIA 1080Ti GPU. Our code is open-sourced for the research community at https://github.com/shanjiayao/PTT.
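
The abstract's idea of computing attention weights over point features can be illustrated with a minimal sketch of generic scaled dot-product self-attention. This is not the authors' PTT module (their code is at the repository linked above); it assumes identity projections in place of learned query, key and value layers, and the names `softmax`, `self_attention` and the sample `points` are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections.

    features: list of N point-feature vectors, each of length d.
    Returns (weights, attended): an N x N weight matrix and N x d features.
    Assumption: queries, keys and values are the raw features themselves;
    a real Transformer block would apply learned linear projections first.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    weights = []
    for query in features:
        # Similarity of this point to every point, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in features]
        weights.append(softmax(scores))
    # Each output feature is a weighted average of all input features.
    attended = [
        [sum(w * value[j] for w, value in zip(row, features))
         for j in range(d)]
        for row in weights
    ]
    return weights, attended


# Hypothetical 2-D features: two similar points and one dissimilar point.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights, attended = self_attention(points)
```

In this toy input, the first point assigns more weight to the second (similar) point than to the third, which is the sense in which attention lets a tracker emphasize features related to the target.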

Recommended

Article Chemistry, Analytical

Point Siamese Network for Person Tracking Using 3D Point Clouds

Yubo Cui, Zheng Fang, Sifan Zhou

SENSORS (2020)

Article Engineering, Electrical & Electronic

3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud

Zheng Fang, Sifan Zhou, Yubo Cui, Sebastian Scherer

Summary: This paper introduces a 3D tracking method called 3D-SiamRPN Network, which tracks a single target object using raw 3D point cloud data. Experimental results show its competitive performance in both Success and Precision, as well as real-time running capabilities.

IEEE SENSORS JOURNAL (2021)

Article Robotics

Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking

Yubo Cui, Jiayao Shan, Zuoxu Gu, Zhiheng Li, Zheng Fang

Summary: This paper proposes a sparse-to-dense and transformer-based framework for 3D single object tracking. By transforming sparse points into pillar structures and compressing them into 2D features, a dense representation is obtained. The framework utilizes attention-based computation for global similarity and multi-scale feature compensation. The object tracking is achieved through a two-stage decoder.

IEEE ROBOTICS AND AUTOMATION LETTERS (2022)

Article Engineering, Electrical & Electronic

A 40nm 2TOPS/W Depth-Completion Neural Network Accelerator SoC With Efficient Depth Engine for Realtime LiDAR Systems

Miao Sun, Yingjie Cao, Jian Qian, Jie Li, Sifan Zhou, Ziyu Zhao, Yifan Wu, Tao Xia, Yajie Qin, Lei Qiu, Shunli Ma, Patrick Yin Chiang, Shenglong Zhuo

Summary: This paper presents a heterogeneous AI-accelerator SoC designed specifically for depth image completion. Three key innovations enhance the SoC's performance: a fully-filled dataflow management engine for preprocessing the RGB+Depth input, a hardware-tiling co-processor for improving the efficiency of the CNN accelerator, and a RISC-V core to better execute vector computations. The implemented SoC achieves 2 TOPS/W energy efficiency and 34 fps throughput at VGA-resolution output for real-time LiDAR systems.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS (2023)

Article Chemistry, Physical

Synergism of electronic structure regulation and interface engineering for boosting hydrogen evolution reaction on S-Scheme FeS2/ S-ZnSnO3 heterostructure

Sifan Zhou, Chunming Yang, Li Guo, Razium Ali Soomro, Maomao Niu, Zhixiong Yang, Rui Du, Danjun Wang, Feng Fu, Bin Xu

Summary: A FeS2/S-ZnSnO3 heterostructure was constructed to achieve efficient photocatalytic hydrogen evolution reaction (HER) activity. The band structure of ZnSnO3 was regulated by sulfur doping, and FeS2 nanoparticles were coupled to improve optical absorption and carrier separation/transfer in the composite. The optimized heterostructure (8.7%FeS2@S15%-ZSO) showed a HER performance of 2225 μmol g⁻¹ h⁻¹, which was significantly higher than that of ZSO, S15%-ZSO, and FeS2. DFT calculations confirmed that S doping regulated the electronic structure of S-ZSO, while the coupling of FeS2 constructed an S-scheme heterostructure, leading to improved HER performance.

APPLIED SURFACE SCIENCE (2023)
