Article

A survey on conventional and learning-based methods for multi-view stereo

Journal

PHOTOGRAMMETRIC RECORD
Volume -, Issue -, Pages -

Publisher

WILEY
DOI: 10.1111/phor.12456

Keywords

deep learning; dense reconstruction; depth estimation; MVS; PatchMatch; stereo matching

3D reconstruction of scenes using multiple images has been extensively studied in recent years. Multi-view stereo algorithms aim to generate a dense 3D model of the scene, but achieving complete, accurate, and aesthetically pleasing representations remains a challenge. This work provides a survey on the most widely used multi-view stereo methods, discussing the underlying concepts and challenges, with a focus on close-range 3D reconstruction applications.
3D reconstruction of scenes from multiple images, which relies on robust correspondence search and depth estimation, has been thoroughly studied for both two-view and multi-view scenarios in recent years. Multi-view stereo (MVS) algorithms aim to generate a rich, dense 3D model of the scene in the form of a dense point cloud or a triangulated mesh. A typical MVS pipeline takes as input the robustly estimated camera poses together with the sparse points obtained from structure from motion (SfM), and computes the depth of, in general, every pixel of the scene. Several methods, either conventional or, more recently, learning-based, have been developed to solve the correspondence search problem. A vast body of research exists on local, global and semi-global stereo matching approaches, with the PatchMatch algorithm being among the most popular and efficient conventional ones of the last decade. Yet, despite the continued evolution of the algorithms, yielding complete, accurate and aesthetically pleasing 3D representations of a scene remains an open issue in real-world and large-scale photogrammetric applications. This work aims to provide a survey of the most widely used MVS methods, investigating underlying concepts and challenges. To this end, the theoretical background and relevant literature are discussed for both conventional and learning-based approaches, with a particular focus on close-range 3D reconstruction applications.

Recommended

Article Energy & Fuels

Porosity Assessment in Geological Cores Using 3D Data

Paulina Kujawa, Krzysztof Chudy, Aleksandra Banasiewicz, Kacper Lesny, Radoslaw Zimroz, Fabio Remondino

Summary: The porosity of rocks is a crucial parameter in rock mechanics and underground mining, affecting fluid movement and internal processes. Conventional testing methods are complex, while modern technologies are expensive. In this study, a core sample with karst and porous structures was used, and resin was poured to reinforce it. The core was then cut and 3D optical scanning was conducted for porosity assessment, achieving accurate results at a reasonable cost.

ENERGIES (2023)

Article Geography, Physical

Multiple View Stereo with quadtree-guided priors

Elisavet Konstantina Stathopoulou, Roberto Battisti, Dan Cernea, Andreas Georgopoulos, Fabio Remondino

Summary: To support depth estimation on challenging surfaces, we propose an extended PatchMatch pipeline using an adaptive accumulated matching cost calculation. Our approach achieves competitive results compared to state-of-the-art methods by favoring the reconstruction of problematic regions while preserving fine details in richly textured regions.

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2023)

Article Geography, Physical

ENRICH: Multi-purposE dataset for beNchmaRking In Computer vision and pHotogrammetry

Davide Marelli, Luca Morelli, Elisa Mariarosaria Farella, Simone Bianco, Gianluigi Ciocca, Fabio Remondino

Summary: High-resolution data and accurate ground truth are crucial for evaluating and comparing methods and algorithms effectively. However, acquiring real data that is representative and diverse in a given application domain is often challenging. To address this issue, this paper introduces a new synthetic dataset called ENRICH for testing photogrammetric and computer vision algorithms. Compared to existing datasets, ENRICH provides higher resolution images with various lighting conditions, camera orientations, scales, and fields of view. ENRICH consists of three sub-datasets: ENRICH-Aerial, ENRICH-Square, and ENRICH-Statue, each showcasing different characteristics. The usefulness of this dataset is demonstrated through various photogrammetry and computer vision tasks, such as evaluating hand-crafted and deep learning-based features, examining the effects of ground control points (GCPs) configuration on 3D accuracy, and monocular depth estimation. ENRICH is publicly available at: https://github.com/davidemarelli/ENRICH.

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2023)

Article Environmental Sciences

Knowledge Enhanced Neural Networks for Point Cloud Semantic Segmentation

Eleonora Grilli, Alessandro Daniele, Maarten Bassier, Fabio Remondino, Luciano Serafini

Summary: Deep learning approaches have become state-of-the-art in domains such as pattern recognition and computer vision, but they require a large amount of training data, which is often a challenge in geospatial and remote sensing fields. The field of Neuro-Symbolic Integration provides a possible solution by incorporating background knowledge into the neural network's learning pipeline, with one method being KENN (Knowledge Enhanced Neural Networks). Empirical results demonstrate that using KENN for point cloud semantic segmentation tasks improves the performance of the original network and achieves state-of-the-art levels of accuracy.

REMOTE SENSING (2023)

Article Chemistry, Multidisciplinary

PhotoMatch: An Open-Source Tool for Multi-View and Multi-Modal Feature-Based Image Matching

Esteban Ruiz de Ona, Ines Barbero-Garcia, Diego Gonzalez-Aguilera, Fabio Remondino, Pablo Rodriguez-Gonzalvez, David Hernandez-Lopez

Summary: This article presents PhotoMatch, an open-source tool for multi-view and multi-modal feature-based image matching, including various state-of-the-art methods for preprocessing, feature extraction, and matching. The tool also supports detailed assessment and comparison of different methods, allowing users to select the best combination of methods for each specific dataset.

APPLIED SCIENCES-BASEL (2023)

Article Environmental Sciences

A Critical Analysis of NeRF-Based 3D Reconstruction

Fabio Remondino, Ali Karami, Ziyang Yan, Gabriele Mazzacca, Simone Rigon, Rongjun Qin

Summary: This paper critically analyzes the use of neural radiance fields (NeRFs) for image-based 3D reconstruction and compares them quantitatively with traditional photogrammetry. The strengths and weaknesses of NeRFs are objectively evaluated, and their applicability to different real-life scenarios is discussed. The study compares various NeRF methods using objects with different sizes and surface characteristics, and evaluates the quality of the resulting 3D reconstructions based on multiple criteria. The results demonstrate the superior performance of NeRFs for non-collaborative objects with texture-less, reflective, and refractive surfaces, while photogrammetry outperforms NeRFs for objects with cooperative texture. The complementarity of these methods should be further explored in future research.

REMOTE SENSING (2023)

Article Remote Sensing

MIN3D Dataset: MultI-seNsor 3D Mapping with an Unmanned Ground Vehicle

Pawel Trybala, Jaroslaw Szrek, Fabio Remondino, Paulina Kujawa, Jacek Wodecki, Jan Blachowski, Radoslaw Zimroz

Summary: The research potential in the field of mobile mapping technologies is often hindered by constraints such as expensive hardware, limited access to target sites, and the difficulty of collecting ground truth data. To address these challenges, the research community often provides open datasets. However, datasets that encompass demanding conditions with synchronized sensors are currently limited. To alleviate this issue, the MIN3D dataset is proposed, which includes data gathered using a wheeled mobile robot in two distinct locations. By sharing this dataset, the aim is to support the development of robust methods for navigation and mapping in challenging underground conditions.

PFG-JOURNAL OF PHOTOGRAMMETRY REMOTE SENSING AND GEOINFORMATION SCIENCE (2023)

Proceedings Paper Engineering, Electrical & Electronic

Vision and UWB-Based Collaborative Positioning Between Ground and UAS Platforms

Andrea Masiero, Charles Toth, Fabio Remondino

Summary: The rapid development of autonomous ground vehicle technologies and the proliferation of unmanned aerial system applications have raised the need for safe and effective navigation solutions. While GNSS has been widely used for civilian applications, its reception is unreliable in certain areas. Collaborative navigation offers a potential solution by sharing navigation information among platforms operating in close vicinity. This research investigates the feasibility and performance of collaborative navigation in areas where ground and airborne vehicles share the same space, and initial results of a field test are reported.

2023 IEEE/ION POSITION, LOCATION AND NAVIGATION SYMPOSIUM, PLANS (2023)

Proceedings Paper Geography, Physical

HANDCRAFTED AND LEARNING-BASED TIE POINT FEATURES - COMPARISON USING THE EUROSDR RPAS BENCHMARK DATASET

M. Peppa, L. Morelli, J. P. Mills, N. T. Penna, F. Remondino

Summary: Accurate and reliable image correspondences are crucial in photogrammetry. Recent research has shown promising results in using machine learning methods to extract tie points, but challenges remain in achieving rotationally invariant features and handling large format imagery.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)

Proceedings Paper Geography, Physical

AERIAL TRIANGULATION WITH LEARNING-BASED TIE POINTS

F. Remondino, L. Morelli, E. Stathopoulou, M. Elhashash, R. Qin

Summary: This paper explores learning-based methods for extracting tie points in aerial image blocks and confirms the potential of these methods in finding reliable image correspondences in the aerial block.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)

Proceedings Paper Geography, Physical

MONOCULAR DEPTH PREDICTION IN PHOTOGRAMMETRIC APPLICATIONS

M. Welponer, E. K. Stathopoulou, F. Remondino

Summary: Despite the recent success of learning-based monocular depth estimation algorithms, they still struggle to produce reliable results in the 3D space without additional scene cues. This study explores supervised CNN architectures for monocular depth estimation and evaluates their potential in 3D reconstruction, introducing a new benchmark for synthetic outdoor scenes.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)

Proceedings Paper Geography, Physical

THE EUROSDR TIME BENCHMARK FOR HISTORICAL AERIAL IMAGES

E. M. Farella, L. Morelli, F. Remondino, J. P. Mills, N. Haala, J. Crompvoets

Summary: The article introduces the TIME benchmark, which aims to explore the potential of historical aerial images. The benchmark provides multiple historical aerial image datasets and ancillary data to support the photogrammetric processing of the photos.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)

Proceedings Paper Geography, Physical

3D DIGITIZATION OF TRANSPARENT AND GLASS SURFACES: STATE OF THE ART AND ANALYSIS OF SOME METHODS

Ali Karami, Roberto Battisti, Fabio Menna, Fabio Remondino

Summary: This paper provides a general overview of the rising need for high-resolution 3D information in the field of industrial metrology for micro-measurements and quality control of transparent objects. It explores the challenges of optical-based 3D reconstruction methods and systems for such objects and reviews various approaches that have been developed to overcome these challenges. The paper also presents 3D results to demonstrate the advantages and disadvantages of each method in dealing with transparent objects.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)

Proceedings Paper Geography, Physical

A NOVEL GEOMETRIC KEY-FRAME SELECTION METHOD FOR VISUAL-INERTIAL SLAM AND ODOMETRY SYSTEMS

A. Azimi, A. Hosseininaveh, F. Remondino

Summary: This paper proposes a novel geometric method for key-frame selection based on ORB-SLAM3, which selects key-frames in a completely flexible way regardless of the environment, data, and scene conditions, according to the physics and geometry of the environment. The proposed method is evaluated qualitatively and quantitatively, showing a significant improvement in positioning accuracy, despite an increase in processing time.

XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II (2022)
