Article
Environmental Sciences
Jing Zhang, Da Xu, Yunsong Li, Liping Zhao, Rui Su
Summary: In this paper, a one-stage end-to-end network called FusionPillars is proposed to fuse multisensor data, including LiDAR point clouds and camera images. FusionPillars includes three branches: a point-based branch, a voxel-based branch, and an image-based branch. Experimental results show that, compared to existing one-stage fusion networks, FusionPillars yields superior performance, with a considerable improvement in detection precision for small objects.
Article
Construction & Building Technology
Shweta Dabetwar, Nitin Nagesh Kulkarni, Marco Angelosanti, Christopher Niezrecki, Alessandro Sabato
Summary: This study proposes a standardized approach using infrared thermography (IRT) in conjunction with unmanned aerial vehicles (UAVs) and point cloud reconstruction to generate three-dimensional (3D) models that can detect heat loss in buildings. The study also investigates the impact of image acquisition parameters on the reconstruction accuracy and provides data on the sensitivity and accuracy of the infrared-based point clouds (IR-PCs). The research defines a standard for using only IR images to reconstruct 3D models and has implications for structural assessment and energy efficiency evaluations.
JOURNAL OF BUILDING ENGINEERING
(2022)
Article
Computer Science, Artificial Intelligence
Qian Yu, Chengzhuan Yang, Hui Wei
Summary: This paper proposes a Part-Wise AtlasNet method based on the architecture of AtlasNet, which imposes constraints on the local structures of 3D objects by restricting each neural network to reconstructing a specific part of the object. The experimental results demonstrate that the proposed method generates structured point clouds of higher visual quality and outperforms other methods in 3D point cloud generation from a single image.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Computer Science, Software Engineering
Sneha Paul, Zachary Patterson, Nizar Bouguila
Summary: This paper presents DualMLP, a novel 3D model that introduces the concept of a two-stream network to handle the trade-off between the number of points and the computational overhead in existing 3D models. By using a small number of points in one branch and a larger number of points in the other branch, DualMLP achieves improved scene understanding while maintaining computational efficiency.
Article
Computer Science, Artificial Intelligence
Anny Yuniarti, Agus Zainal Arifin, Nanik Suciati
Summary: Learning-based methods for 3D reconstruction have gained attention due to their performance in image segmentation and classification. This paper introduces a new framework for 3D reconstruction from 2D images using a 3D template-based point generation network, which shows better performance than existing methods in terms of Chamfer distance on the ShapeNet dataset.
APPLIED SOFT COMPUTING
(2021)
Article
Computer Science, Artificial Intelligence
Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Xiaojun Tong, Wenxiu Sun
Summary: This paper proposes a new deep learning framework to infer the 3D shape of an object from a pair of stereo images, achieving better performance than state-of-the-art methods. Additionally, a large-scale synthetic benchmarking dataset named StereoShapeNet is introduced to evaluate the reconstruction algorithms.
Article
Construction & Building Technology
Mojtaba Noghabaei, Yajie Liu, Kevin Han
Summary: This paper presents a general compatibility analysis method for detecting construction incompatibilities in modular construction using reality capture technologies. The proposed method involves scanning the modules at the manufacturing plant and the construction site, and checking module-to-module compatibility remotely prior to shipment and installation, aiming to avoid project delays and rework.
AUTOMATION IN CONSTRUCTION
(2022)
Article
Computer Science, Artificial Intelligence
Luis Roldao, Raoul de Charette, Anne Verroust-Blondet
Summary: This paper surveys the progress of semantic scene completion (SSC), highlighting the unresolved challenges and evaluating the performance of state-of-the-art techniques on popular datasets.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2022)
Article
Computer Science, Artificial Intelligence
Long Xi, Wen Tang, Tao Xue, TaoRuan Wan
Summary: This paper proposes a novel unsupervised deep learning network, named Binary Tree Network (BTreeNet), which effectively aligns partial and noisy 3D point clouds without training. By separating the learning of rotation and translation features, BTreeNet achieves remarkable generalization and robustness to unseen large and dense scenes. Furthermore, the Iterative Binary Tree Network (IBTreeNet) is introduced to continuously improve registration accuracy for such scenarios.
Article
Computer Science, Software Engineering
Yadong Wang, Teng Ran, Yuan Liang, Guoquan Zheng
Summary: This study aims to improve the quality of 3D reconstruction by addressing the lack of robustness in learning-based methods for depth estimation. An attention-based deep sparse priori cascade multi-view stereo network, ADS-MVSNet, is proposed. It utilizes a feature extraction module based on the attention mechanism and a depth sparse prior strategy module to accurately estimate the depth map and refine it using a coarse-to-fine method for better point cloud reconstruction.
COMPUTERS & GRAPHICS-UK
(2023)
Article
Environmental Sciences
Haihan Zhang, Chun Xie, Hisatoshi Toriya, Hidehiko Shishido, Itaru Kitahara
Summary: This study proposes a system that optimizes vehicle visual positioning by utilizing low-precision city-scale 3D scene maps reconstructed by unmanned aerial vehicles (UAVs). By employing a wall complementarity algorithm and a 3D-to-3D feature registration algorithm, the optimized city-scale 3D scene is combined with the local scene generated by an onboard stereo camera, resulting in improved vehicle localization accuracy. The experimental results show that a completed low-precision scene model yields an average vehicle localization error of 3.91 m, which is close to the 3.27 m error obtained using the high-precision map, validating the effectiveness of the proposed algorithm. The system demonstrates the feasibility of utilizing low-precision city-scale 3D scene maps generated by UAVs for vehicle localization in large-scale scenes.
Article
Computer Science, Information Systems
Wei Liang, Pengfei Xu, Ling Guo, Heng Bai, Yang Zhou, Feng Chen
Summary: With the rapid development of science and technology, 3D object detection is becoming increasingly important in the field of computer vision. This paper mainly focuses on deep learning-based 3D object detection methods, compares the experimental results of different methods, and discusses future research directions.
MULTIMEDIA TOOLS AND APPLICATIONS
(2021)
Article
Engineering, Marine
Miguel Martin-Abadal, Manuel Pinar-Molina, Antoni Martorell-Torres, Gabriel Oliver-Codina, Yolanda Gonzalez-Cid
Summary: In recent years, the use of Autonomous Underwater Vehicles (AUVs) has significantly reduced the workload and risks of interventions in underwater scenarios. This paper proposes using a deep neural network to recognize pipes and valves in multiple underwater scenarios, achieving high recognition accuracy with the PointNet neural network.
JOURNAL OF MARINE SCIENCE AND ENGINEERING
(2021)
Article
Computer Science, Artificial Intelligence
Biao Liu, Bihao Tian, Hengyang Wang, Junchao Qiao, Zhi Wang
Summary: This paper proposes two modules to improve the performance of 3D object detection. The first module reduces data loss by extracting more detailed initial voxel information and fully fusing context information. The second module extracts voxel features using a backbone neural network based on 3D sparse convolution and generates high-quality 3D proposal regions by a cross-connected region proposal network. Additionally, this paper extends the target generation strategy in the anchor-based algorithm, stabilizing the network performance for multiple objects.
NEURAL PROCESSING LETTERS
(2022)
Article
Construction & Building Technology
Xiao Pan, T. Y. Yang
Summary: This paper proposes an autonomous bolt loosening assessment method based on 3D vision. By creating a 3D point cloud of a bolted connection from readily available 2D images and using a convolutional neural network (CNN) to recognize and quantify bolt loosening, accurate localization and measurement of bolt loosening are achieved. The experimental results demonstrate that the proposed method can effectively localize and quantify bolt loosening with high accuracy and low cost.
JOURNAL OF BUILDING ENGINEERING
(2023)
Article
Computer Science, Artificial Intelligence
Hainan Cui, Tianxin Shi, Jun Zhang, Pengfei Xu, Yiping Meng, Shuhan Shen
Summary: The paper proposes an incremental framework for constructing a view-graph to improve its accuracy and robustness. Experimental results show that the new view-graph provides a better foundation for conventional SfM systems compared to many state-of-the-art methods.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Xiang Gao, Lingjie Zhu, Zexiao Xie, Hongmin Liu, Shuhan Shen
Summary: The paper introduces a simple yet effective rotation averaging pipeline called Incremental Rotation Averaging (IRA), inspired by incremental Structure from Motion techniques. By estimating absolute rotations incrementally, IRA is robust to outliers and achieves accurate rotation averaging results. Key techniques such as initial triplet selection, Weighted Local/Global Optimization, and Re-Rotation Averaging further improve the results.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2021)
Article
Engineering, Electrical & Electronic
Xiang Gao, Lingjie Zhu, Hainan Cui, Zexiao Xie, Shuhan Shen
Summary: This work presents an upgraded version of Incremental Rotation Averaging (IRA) called IRA++, which improves the scalability in both accuracy and efficiency by dividing the original Epipolar-geometry Graph (EG) into sub-graphs and performing distributed inner-rotation averaging. The evaluation results on various datasets demonstrate that IRA++ outperforms IRA and other state-of-the-art rotation averaging methods in terms of accuracy and efficiency, especially in large-scale and noise-polluted situations.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Engineering, Electrical & Electronic
Mengqi Rong, Hainan Cui, Zhanyi Hu, Hanqing Jiang, Hongmin Liu, Shuhan Shen
Summary: In this paper, an active learning based 3D semantic labeling method is proposed to generate accurate 3D semantic mesh models by integrating 2D semantic segmentation results and 3D mesh models. Through the iterative process of training, fusion, and selection, the labeling quality is improved while reducing the amount of annotation needed.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Engineering, Electrical & Electronic
Xiang Gao, Lingjie Zhu, Bin Fan, Hongmin Liu, Shuhan Shen
Summary: This paper proposes a simple yet effective translation averaging pipeline, Incremental Translation Averaging (ITA), which overcomes limitations in accuracy, robustness, simplicity, and efficiency present in traditional translation averaging methods. ITA computes camera locations incrementally, which makes it robust to measurement outliers and accurate in parameter estimation while remaining simple and efficient. Comprehensive evaluations on the 1DSfM dataset demonstrate the effectiveness of ITA and its superiority over state-of-the-art translation averaging approaches.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Geography, Physical
Jiali Han, Yuzhou Liu, Mengqi Rong, Xianwei Zheng, Shuhan Shen
Summary: This paper proposes a method called FloorUSG for multistage floorplan reconstruction from RGB images and dense 3D mesh. It combines 2D plane instances and 3D plane primitives and accurately recovers the location of the floorplan through 2D-3D primitive fusion. Experimental results show that the method can recover detailed structures of scenes of different scales and reconstruct the floorplan with high robustness.
ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING
(2023)
Article
Engineering, Electrical & Electronic
Xiang Gao, Hainan Cui, Menghan Li, Zexiao Xie, Shuhan Shen
Summary: We propose IRAv3, which is built upon the state-of-the-art rotation averaging method IRA++, to advance rotation averaging, a fundamental task in 3D computer vision. The key observation is that the community detection-based Epipolar-geometry Graph (EG) clustering in IRA++ is fixed and does not allow changes, limiting the accuracy of absolute rotation estimation. In IRAv3, by contrast, the EG clustering is performed together with the estimation of absolute rotation for each cluster, allowing for dynamic determination of vertex affiliation. Experimental results on the 1DSfM and KITTI odometry datasets demonstrate the effectiveness of IRAv3 in large-scale rotation averaging problems and its advantages over previous works and other state-of-the-art methods.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2023)
Article
Engineering, Civil
Xiang Gao, Dongdong Tao, Yuqian Liu, Zexiao Xie, Shuhan Shen
Summary: Large-scale urban scene 3D mapping requires sensor pose globalization, and integrating street-view images and vehicle-borne LiDAR points can facilitate this task. Existing methods often assume strict synchronization and exact calibration between cameras and LiDARs, which are difficult to guarantee in practice. To address this, we propose a pipeline for temporal and spatial pose globalization using GNSS/IMU guidance, loosening the assumptions. Our method initializes and refines global poses using multi-sensor pre-calibrations, and performs global optimization with image-based, LiDAR-based, and cross-domain data association and constraint construction. Experimental results on self-collected and KITTI Odometry datasets demonstrate the effectiveness of our method for multi-sensor pose globalization in large-scale urban scene 3D mapping.
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Mengqi Rong, Hainan Cui, Shuhan Shen
Summary: Inspired by Active Learning and 2D-3D semantic fusion, our proposed framework utilizes rendered 2D images to achieve efficient semantic segmentation of large-scale 3D scenes with only a few 2D image annotations. By rendering perspective images in the 3D scene and fine-tuning a pre-trained network for segmentation, we can project and fuse dense predictions onto the 3D model. Through an iterative process of rendering-segmentation-fusion, difficult-to-segment image samples can be generated without complex 3D annotations, resulting in label-efficient 3D scene segmentation. Experimental results on three large-scale datasets demonstrate the effectiveness of our method compared to state-of-the-art approaches.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2023)
Proceedings Paper
Automation & Control Systems
Diantao Tu, Baoyu Wang, Hainan Cui, Yuqian Liu, Shuhan Shen
Summary: This paper proposes a novel calibration pipeline that can automatically calibrate multiple cameras and LiDARs in a Structure-from-Motion (SfM) process, eliminating the need for manual design of calibration objects.
2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Hainan Cui, Shuhan Shen
Summary: This paper presents a tailor-made multi-camera based motion averaging system that utilizes fixed relative poses to improve the accuracy and robustness of SfM. The algorithm achieves superior accuracy and robustness compared to state-of-the-art methods, as demonstrated by experiments on various datasets.
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE
(2022)
Article
Remote Sensing
Youli Ding, Xianwei Zheng, Yiping Chen, Shuhan Shen, Hanjiang Xiong
Summary: In this paper, a dense context distillation network (DCDNet) is proposed for semantic segmentation of oblique unmanned aerial vehicle (UAV) images. DCDNet effectively learns distortion-robust feature representations by densely and selectively gathering useful context from dual-scale feature maps. It also incorporates joint supervision and multi-scale feature aggregation for better learning and prediction, achieving state-of-the-art segmentation performance on the challenging UAVid dataset with an mIoU score of 72.38%.
INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION
(2022)
Article
Computer Science, Artificial Intelligence
Zexiao Xie, Xiaoxuan Yu, Xiang Gao, Kunqian Li, Shuhan Shen
Summary: Depth completion is the task of recovering pixelwise depth from incomplete and noisy depth measurements, and it is important for various computer vision applications. Traditional image processing techniques were used in the past, but deep learning methods, especially for LiDAR-image-based depth completion, have become increasingly popular. This article reviews the related works and discusses future research directions for depth completion.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Hainan Cui, Diantao Tu, Fulin Tang, Pengfei Xu, Hongmin Liu, Shuhan Shen
Summary: With the popularization of smartphones, more high-quality videos are available, leading to an increase in the scale of scene reconstruction. A tailor-made framework is proposed to solve the problems caused by high-resolution, high-frame-rate videos, aiming to achieve accurate and robust structure-from-motion based on monocular videos. The key ideas include utilizing the spatial and temporal continuity of video sequences for improved reconstruction accuracy and robustness, as well as leveraging the redundancy of video sequences to enhance efficiency and scalability. The system is able to integrate data from different video sequences for simultaneous reconstruction.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2022)
Proceedings Paper
Automation & Control Systems
Mingzhe Lv, Diantao Tu, Xincheng Tang, Yuqian Liu, Shuhan Shen
Summary: The paper proposes a novel semantically guided Multi-View Stereo method for dense 3D road mapping, which integrates semantic information into the process to improve completeness and handle holes and outliers in low-textured areas. Experimental results demonstrate that the method achieves superior completeness with comparable accuracy for 3D road mapping compared to state-of-the-art MVS methods.
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021)
(2021)