4.7 Article

Real-Time Dense Monocular SLAM With Online Adapted Depth Prediction Network

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 21, Issue 2, Pages 470-483

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2018.2859034

Keywords

Monocular SLAM; dense mapping; convolutional neural network; fusion; online tuning

Funding

  1. National Natural Science Foundation of China [61502188]
  2. Wuhan Science and Technology Bureau [2017010201010111]
  3. Program for HUST Academic Frontier Youth Team

Considerable advances have been achieved in estimating the depth map from a single image via convolutional neural networks (CNNs) during the past few years. Combining depth prediction from CNNs with conventional monocular simultaneous localization and mapping (SLAM) is promising for accurate and dense monocular reconstruction, in particular addressing the two long-standing challenges in conventional monocular SLAM: low map completeness and scale ambiguity. However, depth estimated by pretrained CNNs usually fails to achieve sufficient accuracy for environments that differ in type from the training data, which are common in applications such as drone obstacle avoidance in unknown scenes. Additionally, inaccurate CNN depth predictions can yield large tracking errors in monocular SLAM. In this paper, we present a real-time dense monocular SLAM system that fuses direct monocular SLAM with an online-adapted depth prediction network, both to achieve accurate depth prediction for scenes that differ in type from the training data and to provide absolute scale information for tracking and mapping. Specifically, on the one hand, the tracked pose (i.e., translation and rotation) from direct SLAM is used to select a small set of highly effective and reliable training images, which act as ground truth for tuning the depth prediction network on the fly toward better generalization to scenes of different types. A stage-wise stochastic gradient descent algorithm with a selective update strategy is introduced for efficient convergence of the tuning process. On the other hand, the dense map produced by the adapted network is applied to address the scale ambiguity of direct monocular SLAM, which in turn improves the accuracy of both tracking and overall reconstruction. The system, with the assistance of both CPUs and GPUs, achieves real-time performance with progressively improved reconstruction accuracy. Experimental results on public datasets and a live application to drone obstacle avoidance demonstrate that our method outperforms state-of-the-art methods, with greater map completeness and accuracy and a smaller tracking error.
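
To make the scale-ambiguity point in the abstract concrete, the following is a minimal Python sketch of one common way to bring an up-to-scale SLAM depth map into metric units using a depth prediction that carries absolute scale. It is not taken from the paper: the median-ratio estimator and the names `estimate_scale` and `apply_scale` are assumptions chosen for illustration, and depth maps are simplified to flat lists in which `None` marks a pixel with no estimate.

```python
from statistics import median


def estimate_scale(slam_depth, cnn_depth, min_depth=1e-6):
    """Estimate s such that s * slam_depth roughly matches cnn_depth.

    Hypothetical helper (not the paper's method): uses the median of
    per-pixel ratios over pixels where both depths are valid, which is
    robust to a few badly predicted pixels.
    """
    ratios = [
        c / s
        for s, c in zip(slam_depth, cnn_depth)
        if s is not None and c is not None and s > min_depth and c > min_depth
    ]
    if not ratios:
        raise ValueError("no overlapping valid depth samples")
    return median(ratios)


def apply_scale(slam_depth, scale):
    """Rescale the SLAM depths, leaving missing pixels as None."""
    return [None if d is None else d * scale for d in slam_depth]


# Toy data: SLAM depths are in arbitrary units, CNN depths in meters.
slam = [0.5, 1.0, None, 2.0, 1.5]
cnn = [1.0, 2.0, 3.0, 4.2, None]

scale = estimate_scale(slam, cnn)
metric = apply_scale(slam, scale)
print(scale, metric)
```

Only the three pixels valid in both maps contribute to the estimate here; a real system would work on image arrays and would typically weight or filter pixels by confidence.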

Recommended

Article Nanoscience & Nanotechnology

Roadmap on emerging hardware and technology for machine learning

Karl Berggren, Qiangfei Xia, Konstantin K. Likharev, Dmitri B. Strukov, Hao Jiang, Thomas Mikolajick, Damien Querlioz, Martin Salinga, John R. Erickson, Shuang Pi, Feng Xiong, Peng Lin, Can Li, Yu Chen, Shisheng Xiong, Brian D. Hoskins, Matthew W. Daniels, Advait Madhavan, James A. Liddle, Jabez J. McClelland, Yuchao Yang, Jennifer Rupp, Stephen S. Nonnenmann, Kwang-Ting Cheng, Nanbo Gong, Miguel Angel Lastras-Montano, A. Alec Talin, Alberto Salleo, Bhavin J. Shastri, Thomas Ferreira de Lima, Paul Prucnal, Alexander N. Tait, Yichen Shen, Huaiyu Meng, Charles Roques-Carmes, Zengguang Cheng, Harish Bhaskaran, Deep Jariwala, Han Wang, Jeffrey M. Shainline, Kenneth Segall, J. Joshua Yang, Kaushik Roy, Suman Datta, Arijit Raychowdhury

Summary: Recent progress in artificial intelligence is primarily attributed to the rapid development of machine learning, but the performance and energy efficiency of hardware systems set fundamental limits on machine learning capabilities. Data-centric computing requires a revolution in hardware systems, with new hardware platforms offering hope for future computing with improved throughput and energy efficiency. However, challenges such as materials selection, device optimization, circuit fabrication, and system integration must be addressed in building such systems.

NANOTECHNOLOGY (2021)

Article Computer Science, Information Systems

Variation-Aware Federated Learning With Multi-Source Decentralized Medical Image Data

Zengqiang Yan, Jeffry Wicaksana, Zhiwei Wang, Xin Yang, Kwang-Ting Cheng

Summary: The paper introduces a variation-aware federated learning (VAFL) framework to address the cross-client variation problem in medical image data by minimizing variations among clients while preserving privacy, used for automated classification of clinically significant prostate cancer.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2021)

Review Materials Science, Multidisciplinary

The 2021 flexible and printed electronics roadmap

Yvan Bonnassieux, Christoph J. Brabec, Yong Cao, Tricia Breen Carmichael, Michael L. Chabinyc, Kwang-Ting Cheng, Gyoujin Cho, Anjung Chung, Corie L. Cobb, Andreas Distler, Hans-Joachim Egelhaaf, Gerd Grau, Xiaojun Guo, Ghazaleh Haghiashtiani, Tsung-Ching Huang, Muhammad M. Hussain, Benjamin Iniguez, Taik-Min Lee, Ling Li, Yuguang Ma, Dongge Ma, Michael C. McAlpine, Tse Nga Ng, Ronald Österbacka, Shrayesh N. Patel, Junbiao Peng, Huisheng Peng, Jonathan Rivnay, Leilai Shao, Daniel Steingart, Robert A. Street, Vivek Subramanian, Luisa Torsi, Yunyun Wu

Summary: This roadmap presents perspectives and visions from leading researchers in the fields of flexible and printable electronics, covering device technologies, fabrication techniques, and design and modeling approaches essential for future development of new applications leveraging flexible electronics (FE). It aims to serve as a resource on the current status and future challenges, highlighting the broad opportunities made available by FE technologies.

FLEXIBLE AND PRINTED ELECTRONICS (2021)

Article Computer Science, Hardware & Architecture

R2F: A Remote Retraining Framework for AIoT Processors With Computing Errors

Dawen Xu, Meng He, Cheng Liu, Ying Wang, Long Cheng, Huawei Li, Xiaowei Li, Kwang-Ting Cheng

Summary: AIoT processors fabricated with newer technology nodes are susceptible to rising soft errors, especially in deep learning accelerators. To address this issue, a remote retraining framework and an optimized partial triple modular redundancy strategy are proposed. The experiments show that this approach allows for tradeoffs between model accuracy and performance penalty, while a data transmission optimization method reduces retraining time significantly.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS (2021)

Article Computer Science, Interdisciplinary Applications

Adaptive Contrast for Image Regression in Computer-Aided Disease Assessment

Weihang Dai, Xiaomeng Li, Wan Hang Keith Chiu, Michael D. Kuo, Kwang-Ting Cheng

Summary: This paper proposes AdaCon, a contrastive learning framework for deep image regression, which incorporates a novel adaptive-margin contrastive loss and a regression prediction branch for feature learning. By considering label distance relationships in feature representations, AdaCon achieves better performance in downstream regression tasks. Experimental results on two medical image regression tasks demonstrate the effectiveness of AdaCon, with relative improvements of 3.3% and 5.9% in MAE compared to state-of-the-art methods for BMD estimation and LVEF prediction, respectively.

IEEE TRANSACTIONS ON MEDICAL IMAGING (2022)

Article Computer Science, Artificial Intelligence

Imitation Learning-Based Algorithm for Drone Cinematography System

Yuanjie Dang, Chong Huang, Peng Chen, Ronghua Liang, Xin Yang, Kwang-Ting Cheng

Summary: This study proposes an integrated aerial filming system that autonomously captures cinematic shots of action scenes by imitating demonstrations. The system uses the deep deterministic policy gradient algorithm to build a model and a spatial attention network that selectively focuses on discriminative joints of the skeleton. Experimental results demonstrate that the method successfully mimics the viewpoint selection strategy and captures more accurate viewpoints than existing techniques.

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS (2022)

Article Computer Science, Hardware & Architecture

HyCA: A Hybrid Computing Architecture for Fault-Tolerant Deep Learning

Cheng Liu, Cheng Chu, Dawen Xu, Ying Wang, Qianlong Wang, Huawei Li, Xiaowei Li, Kwang-Ting Cheng

Summary: This paper proposes a hybrid computing architecture for fault-tolerant DLAs, which shows significantly higher reliability, scalability, and performance with less chip area penalty compared to conventional redundancy approaches. By taking advantage of flexible recomputing, it can also be used to scan the entire 2-D computing array and effectively detect faulty PEs at runtime.

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (2022)

Article Computer Science, Information Systems

Customized Federated Learning for Multi-Source Decentralized Medical Image Classification

Jeffry Wicaksana, Zengqiang Yan, Xin Yang, Yang Liu, Lixin Fan, Kwang-Ting Cheng

Summary: The performance of deep networks for medical image analysis is often limited by the scarcity of medical data and privacy concerns. To address this issue, we propose CusFL, a customized federated learning approach that enables each client to train a client-specific model based on a federated global model. By using a federated feature extractor for guidance, CusFL allows clients to selectively learn useful knowledge from the federated model and improve their personalized models.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space

Arnav Chavan, Zhiqiang Shen, Zhuang Liu, Zechun Liu, Kwang-Ting Cheng, Eric Xing

Summary: This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework. The proposed method achieves high compression rates and accuracy improvements on various vision transformers through an end-to-end searching process across multiple dimensions.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

InsMix: Towards Realistic Generative Data Augmentation for Nuclei Instance Segmentation

Yi Lin, Zeyu Wang, Kwang-Ting Cheng, Hao Chen

Summary: This paper proposes a realistic data augmentation method called InsMix for nuclei segmentation. The method utilizes morphology constraints and background perturbation to enhance the images, enabling rich information acquisition about the nuclei while preserving their morphology characteristics. Experimental results demonstrate the superior performance of the proposed method on two datasets.

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT II (2022)

Proceedings Paper Neuroimaging

Dual-Distribution Discrepancy for Anomaly Detection in Chest X-Rays

Yu Cai, Hao Chen, Xin Yang, Yu Zhou, Kwang-Ting Cheng

Summary: This paper proposes a new method for anomaly detection using both known normal images and unlabeled images, achieving significant improvements on three CXR datasets.

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III (2022)

Article Computer Science, Artificial Intelligence

One-Shot Imitation Drone Filming of Human Motion Videos

Chong Huang, Yuanjie Dang, Peng Chen, Xin Yang, Kwang-Ting Cheng

Summary: Imitation learning is applied to autonomous camera systems, but current methods require a large number of training videos with similar styles and struggle to generalize to different styles. To address this, a framework called one-shot imitation filming is proposed, which can imitate a filming style without style-specific model training using two key techniques: filming style feature extraction and camera motion prediction.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Towards Robust Dual-View Transformation via Densifying Sparse Supervision for Mammography Lesion Matching

Junlin Xian, Zhiwei Wang, Kwang-Ting Cheng, Xin Yang

Summary: A holistic understanding of dual-view transformation is important for computer-aided diagnosis of breast lesions, and densifying sparse supervision by synthesizing lesions across two views can lead to superior performance in cross-view lesion matching.

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT V (2021)

Article Computer Science, Artificial Intelligence

Joint Multi-Dimension Pruning via Numerical Gradient Update

Zechun Liu, Xiangyu Zhang, Zhiqiang Shen, Yichen Wei, Kwang-Ting Cheng, Jian Sun

Summary: This study presents a joint multi-dimension pruning method, effectively pruning a network on three crucial aspects simultaneously. By defining the pruning vector, constructing a mapping from the vector to the pruned network structure, and optimizing the vector through numerical gradient optimization, the method collaboratively optimizes across dimensions and achieves better performance.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Article Engineering, Civil

Fast Depth Prediction and Obstacle Avoidance on a Monocular Drone Using Probabilistic Convolutional Neural Network

Xin Yang, Jingyu Chen, Yuanjie Dang, Hongcheng Luo, Yuesheng Tang, Chunyuan Liao, Peng Chen, Kwang-Ting Cheng

Summary: This paper presents a real-time onboard approach for monocular depth prediction and obstacle avoidance with a lightweight probabilistic CNN, which efficiently predicts depth and confidence, generates traversable waypoints, and produces control inputs for drones. Experimental results demonstrate that the method runs faster than state-of-the-art approaches and achieves better depth estimation accuracy, showing superiority in obstacle avoidance in simulated and real environments.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2021)
