Article

Adaptive region aggregation for multi-view stereo matching using deformable convolutional networks

PHOTOGRAMMETRIC RECORD (2023)

Journal

PHOTOGRAMMETRIC RECORD

Publisher

WILEY

DOI: 10.1111/phor.12459

Keywords

adaptive region aggregation; deformable convolutional network; dense matching; multi-view stereo

Categories

Geography, Physical; Geosciences, Multidisciplinary; Remote Sensing; Imaging Science & Photographic Technology

Automated Summary

This paper proposes a learnable adaptive region aggregation method based on deformable convolutional networks (DCNs) for efficient three-dimensional reconstruction in multi-view stereo (MVS) applications. The method integrates a DCN into the feature extraction workflow of MVSNet and introduces a dedicated offset regulariser to promote the convergence of the learnable offsets of the DCN. The effectiveness of the proposed method is validated through quantitative and qualitative evaluations on benchmark datasets.

Abstract

Deep-learning methods have demonstrated promising performance in multi-view stereo (MVS) applications. However, it remains challenging to apply a geometrical prior to the adaptive matching windows to achieve efficient three-dimensional reconstruction. To address this problem, this paper proposes a learnable adaptive region aggregation method based on deformable convolutional networks (DCNs), which is integrated into the feature extraction workflow of an MVSNet method that uses a coarse-to-fine structure. Following the conventional pipeline of MVSNet, a DCN is used to densely estimate and apply transformations in our feature extractor, which is composed of a deformable feature pyramid network (DFPN). Furthermore, we introduce a dedicated offset regulariser to promote the convergence of the learnable offsets of the DCN. The effectiveness of the proposed DFPN is validated through quantitative and qualitative evaluations on the BlendedMVS and Tanks and Temples benchmark datasets within a cross-dataset evaluation setting.
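As a rough illustration of the idea in the abstract, the sketch below shows how a deformable 3x3 aggregation window can shift each sampling tap by its own offset, together with a simple penalty on the offset magnitudes. It is not the authors' implementation: the function names (`bilinear`, `deformable_aggregate`, `offset_l2_penalty`) are hypothetical, the offsets are passed in by hand rather than predicted by a network, and the L2 penalty is only an assumed stand-in, since the abstract does not state the form of the offset regulariser.

```python
import math

def bilinear(img, y, x):
    """Sample a 2D list `img` at real-valued (y, x); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy, xx in ((y0, x0), (y0, x0 + 1), (y0 + 1, x0), (y0 + 1, x0 + 1)):
        if 0 <= yy < h and 0 <= xx < w:
            weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
            val += weight * img[yy][xx]
    return val

# Regular 3x3 sampling grid (dy, dx) used by an ordinary convolution.
GRID = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def deformable_aggregate(img, y, x, weights, offsets):
    """Weighted sum over 9 taps, each displaced by its own (dy, dx) offset."""
    total = 0.0
    for (gy, gx), w, (oy, ox) in zip(GRID, weights, offsets):
        total += w * bilinear(img, y + gy + oy, x + gx + ox)
    return total

def offset_l2_penalty(offsets):
    """Mean squared offset magnitude (assumed regulariser, not taken from the paper)."""
    return sum(oy * oy + ox * ox for oy, ox in offsets) / len(offsets)

# Toy 4x4 image whose value grows linearly with row and column.
image = [[4 * r + c for c in range(4)] for r in range(4)]
uniform = [1 / 9] * 9

regular = deformable_aggregate(image, 1, 1, uniform, [(0.0, 0.0)] * 9)
shifted = deformable_aggregate(image, 1, 1, uniform, [(0.0, 0.5)] * 9)
penalty = offset_l2_penalty([(0.0, 0.5)] * 9)
```

With all offsets at zero the window reduces to an ordinary 3x3 average; shifting every tap half a pixel to the right moves the aggregated region accordingly, and the penalty grows with the size of the shift. In a trained network the offsets would be predicted per pixel and a term of this kind would be added to the training loss.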

Recommended

Article Computer Science, Artificial Intelligence

Deformable convolutions in multi-view stereo

Juliano Emir Nunes Masson, Marcelo Roberto Petry, Daniel Ferreira Coutinho, Leonardo de Mello Honorio

Summary: Multi-View Stereo (MVS) is a critical step in photogrammetry, relying on the ability to match features in different images. Convolutional Neural Networks have been used to solve this problem, but they consume a large amount of Video RAM. This study reduces GPU memory usage and introduces deformable convolutions to improve the performance.

IMAGE AND VISION COMPUTING (2022)

Article Chemistry, Multidisciplinary

Multi-Scale Aggregation Stereo Matching Network Based on Dense Grouping Atrous Convolution

Qijie Zou, Jie Zhang, Shuang Chen, Bing Gao, Jing Qin, Aotian Dong

Summary: The key to image depth estimation is accurately finding corresponding points between the left and right images. Binocular cameras can directly estimate the depth of the image range, avoiding the need for target recognition accuracy in monocular depth estimation. However, accurately segmenting objects and finding matching points in the ill-posed areas of the left and right images is difficult for binocular stereo matching.

APPLIED SCIENCES-BASEL (2023)

Article Computer Science, Artificial Intelligence

Implicit neural refinement based multi-view stereo network with adaptive correlation

Boyang Song, Xiaoguang Hu, Jin Xiao, Guofeng Zhang, Tianyou Chen

Summary: In this paper, the authors propose an end-to-end trainable framework called ACINR-MVSNet for multi-view stereo (MVS) with adaptive group-wise correlation and implicit neural depth refinement. The framework consists of a one-stage MVS architecture followed by refinement modules and an implicit neural refinement module. An adaptive group-wise correlation similarity measure is proposed to solve visibility problems, and a pyramid-based feature extraction network is utilized to gather context-aware information. The experiments demonstrate the effectiveness and generalization of the proposed approach.

IMAGE AND VISION COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Domain-adaptive modules for stereo matching network

Zhi Ling, Kai Yang, Jinlong Li, Yu Zhang, Xiaorong Gao, Lin Luo, Liming Xie

Summary: This paper investigates the inherent factor hindering the adaptive performance of stereo matching networks and proposes a domain-adaptive feature extractor and feature normalization method. Furthermore, the influence of various modules on the performance of the domain-adaptive network is explored.

NEUROCOMPUTING (2021)

Article Chemistry, Multidisciplinary

Adaptive Deconvolution-Based Stereo Matching Net for Local Stereo Matching

Xin Ma, Zhicheng Zhang, Danfeng Wang, Yu Luo, Hui Yuan

Summary: In deep learning-based local stereo matching, larger image patches improve accuracy, but unrestricted enlargement leads to saturation. This study proposes an adaptive deconvolution-based disparity matching network by simplifying the Siamese convolutional network and adding deconvolution layers, achieving a good trade-off between accuracy and complexity.

APPLIED SCIENCES-BASEL (2022)

Article Geosciences, Multidisciplinary

A-SATMVSNet: An attention-aware multi-view stereo matching network based on satellite imagery

Li Lin, Yuanben Zhang, Zongji Wang, Lili Zhang, Xiongfei Liu, Qianqian Wang

Summary: This paper proposes a satellite image stereo matching network based on an attention mechanism to improve the accuracy of stereo matching results. By introducing a new feature extraction module and an attention mechanism, this method effectively solves the problems of insufficient surface feature extraction and matching errors. Experimental results demonstrate the superiority of the proposed method in satellite image stereo matching.

FRONTIERS IN EARTH SCIENCE (2023)

Article Engineering, Electrical & Electronic

Dense-CNN: Dense convolutional neural network for stereo matching using multiscale feature connection

Congxuan Zhang, Junjie Wu, Zhen Chen, Wen Liu, Ming Li, Shaofeng Jiang

Summary: The Dense-CNN stereo matching method proposed in this paper utilizes a novel densely connected network with multiscale convolutional layers and a new loss-function strategy to address image feature loss issues. Experimental results show superior performance in computational accuracy and robustness compared to state-of-the-art approaches.

SIGNAL PROCESSING-IMAGE COMMUNICATION (2021)

Article Computer Science, Software Engineering

Adaptive depth estimation for pyramid multi-view stereo

Jie Liao, Yanping Fu, Qingan Yan, Fei Luo, Chunxia Xiao

Summary: This paper proposes an MVS network for efficient high-resolution depth estimation by adaptively refining and upsampling the depth map to the desired resolution, reducing excessive computation on accurate positions. Experimental results show that the method can generate comparable results with state-of-the-art learning methods, reconstructing more geometric details and consuming less GPU memory.

COMPUTERS & GRAPHICS-UK (2021)

Article Computer Science, Information Systems

PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention

Ke Zhang, Mengyu Liu, Jinlai Zhang, Zhenbiao Dong

Summary: This paper introduces a new multi-view stereo network with a pyramid attention module to enhance the feature representation of Point-MVSNet. Experimental results show that the method outperforms existing state-of-the-art methods on the DTU dataset, demonstrating its effectiveness.

IEEE ACCESS (2021)

Article Computer Science, Information Systems

HighRes-MVSNet: A Fast Multi-View Stereo Network for Dense 3D Reconstruction From High-Resolution Images

Rafael Weilharter, Friedrich Fraundorfer

Summary: This study introduces an end-to-end deep learning architecture for 3D reconstruction, focusing on reducing memory requirements to utilize information from high-resolution images. By limiting the depth search range and utilizing a pyramid structure to gradually search for depth correspondences, the method can generate highly accurate 3D models using less GPU memory and runtime.

IEEE ACCESS (2021)

Article Computer Science, Artificial Intelligence

Improved quadruple sparse census transform and adaptive multi-shape aggregation algorithms for precise stereo matching

Chih-Hsuan Huang, Jar-Ferr Yang

Summary: The paper introduces two new stereo matching algorithms that improve the census function and matching aggregation strategy, leading to higher matching accuracy. Experimental results demonstrate that the proposed system outperforms existing stereo matching algorithms in terms of accuracy.

IET COMPUTER VISION (2022)

Article Computer Science, Interdisciplinary Applications

Multi-Modal Tumor Segmentation With Deformable Aggregation and Uncertain Region Inpainting

Yue Zhang, Chengtao Peng, Ruofeng Tong, Lanfen Lin, Yen-Wei Chen, Qingqing Chen, Hongjie Hu, S. Kevin Zhou

Summary: In this paper, we propose a novel multi-modal tumor segmentation method with deformable feature fusion and uncertain region refinement to address the deficiencies of known methods. Experimental results demonstrate that our method achieves promising tumor segmentation results and outperforms state-of-the-art methods.

IEEE TRANSACTIONS ON MEDICAL IMAGING (2023)

Article Computer Science, Information Systems

Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network

Zhenguo Liu, Zhao Li, Wengang Ao, Shaoshuang Zhang, Wenlong Liu, Yizhi He

Summary: Compared to 3D convolution, 2D convolution is less computationally expensive and faster in stereo matching methods. However, the initial cost volume generated by 2D convolution lacks rich information, resulting in lower robustness and accuracy in the disparity map affected by illumination. To address this, the proposed MCAFNet utilizes multi-scale adaptive cost attention and adaptive fusion to enrich the cost volume. With the improvements, the model achieves better performance in terms of EPE and error matching rates on different datasets.

ELECTRONICS (2023)

Article Computer Science, Artificial Intelligence

Multi-hierarchy feature extraction and multi-step cost aggregation for stereo matching

Aixin Chong, Hui Yin, Yanting Liu, Jin Wan, Zhihao Liu, Ming Han

Summary: Compared with traditional hand-crafted feature based methods, learning-based stereo matching methods have made significant progress in matching accuracy. However, current CNN-based methods often require a substantial amount of time and memory consumption. To address this issue, we propose an accurate and fast stereo matching network that incorporates multi-hierarchy feature extraction and multi-step cost aggregation. Experimental results demonstrate that our network achieves highly competitive disparity estimation accuracy with fast inference speed.

NEUROCOMPUTING (2022)

Article Optics

Occlusion-aware light field depth estimation with view attention

Xucheng Wang, Chenning Tao, Zhenrong Zheng

Summary: A two-stage attention-based occlusion-aware light field depth estimation network is proposed in this study, which can achieve accurate depth estimation in occluded regions and ranks first in the 4D light field benchmark.

OPTICS AND LASERS IN ENGINEERING (2023)
