Article

Scene recognition: A comprehensive survey

Journal

PATTERN RECOGNITION
Volume 102, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2020.107205

Keywords

Scene recognition; Patch feature encoding; Spatial layout pattern learning; Discriminative region detection; Convolutional neural networks; Deep learning

Funding

  1. Programme for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning
  2. JSPS KAKENHI [15K00159]
  3. Grants-in-Aid for Scientific Research [15K00159] Funding Source: KAKEN

With the success of deep learning in the field of computer vision, object recognition has made important breakthroughs, and its recognition accuracy has been drastically improved. However, the performance of scene recognition is still not sufficient to some extent because of complex configurations. Over the past several years, scene recognition algorithms have undergone important evolution as a result of the development of machine learning and Deep Convolutional Neural Networks (DCNN). This paper reviews many of the most popular and effective approaches to scene recognition, which is expected to create benefits for future research and practical applications. We seek to establish relationships among different algorithms and determine the critical components that lead to remarkable performance. Through the analysis of some representative schemes, motivation and insights are identified, which will help to facilitate the design of better recognition architectures. In addition, currently available scene datasets and benchmarks are presented for evaluation and comparison. Finally, potential problems and promising directions are highlighted. © 2020 Elsevier Ltd. All rights reserved.

Recommended

Article Computer Science, Information Systems

Hierarchical Coding of Convolutional Features for Scene Recognition

Lin Xie, Feifei Lee, Li Liu, Zhong Yin, Qiu Chen

IEEE TRANSACTIONS ON MULTIMEDIA (2020)

Article Chemistry, Analytical

A Supervised Video Hashing Method Based on a Deep 3D Convolutional Neural Network for Large-Scale Video Retrieval

Hanqing Chen, Chunyan Hu, Feifei Lee, Chaowei Lin, Wei Yao, Lu Chen, Qiu Chen

Summary: This study introduces a content-based video retrieval system using the deep supervised video hashing (DSVH) framework, and demonstrates its advantages through experiments on different datasets.

SENSORS (2021)

Article Computer Science, Artificial Intelligence

An joint end-to-end framework for learning with noisy labels

Qian Zhang, Feifei Lee, Ya-gang Wang, Damin Ding, Wei Yao, Lu Chen, Qiu Chen

Summary: The paper introduces a novel end-to-end framework for noise correction, which can completely correct noisy labels to true labels and keep the number of each class more balanced without requiring any extra conditions. Experimental results show that the proposed method outperforms other state-of-the-art methods on publicly available CIFAR-10, CIFAR-100 and Clothing1M datasets.

APPLIED SOFT COMPUTING (2021)

Article Computer Science, Information Systems

CJC-net: A cyclical training method with joint loss and co-teaching strategy net for deep learning under noisy labels

Qian Zhang, Feifei Lee, Ya-gang Wang, Damin Ding, Shuai Yang, Chaowei Lin, Qiu Chen

Summary: The paper introduces a novel framework CJC-net for learning with noisy labels, which utilizes cyclical training with joint loss and co-teaching strategy to help networks transition from overfitting to underfitting states, thereby improving the accuracy in identifying noisy labels and hard samples.

INFORMATION SCIENCES (2021)

Article Computer Science, Artificial Intelligence

Scene recognition using multiple representation network

Chaowei Lin, Feifei Lee, Lin Xie, Jiawei Cai, Hanqing Chen, Li Liu, Qiu Chen

Summary: In this paper, a comprehensive representation for scene recognition is proposed, which includes enhanced global scene representation, local salient scene representation, and local contextual object representation. The multiple representations are constructed using two pretrained CNNs and specific techniques, and they are generated by an end-to-end trainable model. Experimental results show that the proposed model outperforms existing models.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Interdisciplinary Applications

An efficient U-shaped network combined with edge attention module and context pyramid fusion for skin lesion segmentation

Bin Zuo, Feifei Lee, Qiu Chen

Summary: Skin lesion segmentation is crucial in skin diagnosis. This paper proposes a novel U-shaped network called EAM-CPFNet, which combines an edge attention module (EAM) and context pyramid fusion (CPF) to improve the performance of skin lesion segmentation. Experimental results show that the proposed method is competitive on a publicly available dataset.

MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING (2022)

Article Computer Science, Artificial Intelligence

HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation

Zhihong Yu, Feifei Lee, Qiu Chen

Summary: In this paper, a hybrid CNN-Transformer model based on a neural architecture search network (HCT-Net) is proposed, which can capture feature information, save time and human energy, and reduce GPU memory consumption and complexity. The model achieves competitive precision and efficiency on various medical image segmentation datasets, and its generalization is validated on unseen datasets.

APPLIED INTELLIGENCE (2023)

Article Chemistry, Multidisciplinary

DMA-Net: Decoupled Multi-Scale Attention for Few-Shot Object Detection

Xijun Xie, Feifei Lee, Qiu Chen

Summary: This paper proposes a novel framework called DMA-Net, which utilizes the DMAM module to perform multi-scale feature extraction and information fusion, and the DGM module to reduce the impact of information exchange between branches. DMA-Net achieves incremental FSOD and demonstrates state-of-the-art performance in this setting.

APPLIED SCIENCES-BASEL (2023)

Article Computer Science, Information Systems

TCC-net: A two-stage training method with contradictory loss and co-teaching based on meta-learning for learning with noisy labels

Qiangqiang Xia, Feifei Lee, Qiu Chen

Summary: With the development of deep neural networks, there is a growing demand for accurately labeled datasets. However, human-labeled datasets often have mistakes, leading to misleading information. This paper proposes a two-stage learning framework called TCC-net to address the issue of learning with noisy labels. The experimental results show that TCC-net outperforms other state-of-the-art methods on corrupted datasets.

INFORMATION SCIENCES (2023)

Article Computer Science, Information Systems

Fall Detection System on Smart Walker Based on Multisensor Data Fusion and SPRT Method

Da-Min Ding, Ya-Gang Wang, Wei Zhang, Qiu Chen

Summary: This paper proposes an improved fall detection method for a smart walker based on IMU sensors and image recognition, which improves detection accuracy and real-time performance.

IEEE ACCESS (2022)

Article Computer Science, Information Systems

SRUNet: Stacked Reversed U-Shape Network for Lightweight Single Image Super-Resolution

Zhiyi Feng, Feifei Lee, Qiu Chen

Summary: In recent years, lightweight models have been successfully applied to single image super-resolution tasks, but most models fail to fully utilize multi-scale features. To address this issue, we propose a stacked reversed U-shape network (SRUNet) that progressively performs upsampling and downsampling operations to extract richer multi-scale features. Additionally, we introduce dense connections and fusion modules for better utilization of multi-scale features.

IEEE ACCESS (2022)

Article Computer Science, Information Systems

MOO-DNAS: Efficient Neural Network Design via Differentiable Architecture Search Based on Multi-Objective Optimization

Hui Wei, Feifei Lee, Chunyan Hu, Qiu Chen

Summary: This paper proposes an efficient CNN architecture search framework, MOO-DNAS, based on multi-objective optimization. The framework aims to find an efficient model by balancing classification accuracy and network latency, utilizing a novel factorized hierarchical search space and a robust hard-sampling strategy.

IEEE ACCESS (2022)

Article Computer Science, Information Systems

NAEM: Noisy Attention Exploration Module for Deep Reinforcement Learning

Zhenwen Cai, Feifei Lee, Chunyan Hu, Koji Kotani, Qiu Chen

Summary: The paper introduces a novel lightweight and general neural network module NAEM, which achieves significant performance improvement in deep reinforcement learning by introducing Gaussian noise into the attention mechanism for global exploration.

IEEE ACCESS (2021)

Article Computer Science, Information Systems

An Improved Capsule Network Based on Capsule Filter Routing

Wei Wang, Feifei Lee, Shuai Yang, Qiu Chen

Summary: CFR-CapsNet is an improved capsule network that uses capsule filter routing (CFR) and a self-attention mechanism to enhance CapsNet performance, and improves network structure relevance through information transmission. Experimental results show that this method can effectively improve the performance of CapsNet.

IEEE ACCESS (2021)

Article Computer Science, Artificial Intelligence

Exploiting sublimated deep features for image retrieval

Guang-Hai Liu, Zuo-Yong Li, Jing-Yu Yang, David Zhang

Summary: This article introduces a novel image retrieval method that improves retrieval performance by using sublimated deep features. The method incorporates orientation-selective features and color perceptual features, effectively mimicking these mechanisms to provide a more discriminating representation.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Region-adaptive and context-complementary cross modulation for RGB-T semantic segmentation

Fengguang Peng, Zihan Ding, Ziming Chen, Gang Wang, Tianrui Hui, Si Liu, Hang Shi

Summary: RGB-Thermal (RGB-T) semantic segmentation is an emerging task that aims to improve the robustness of segmentation methods under extreme imaging conditions by using thermal infrared modality. The challenges of foreground-background distinguishment and complementary information mining are addressed by proposing a cross modulation process with two collaborative components. Experimental results show that the proposed method achieves state-of-the-art performances on current RGB-T segmentation benchmarks.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

F-SCP: An automatic prompt generation method for specific classes based on visual language pre-training models

Baihong Han, Xiaoyan Jiang, Zhijun Fang, Hamido Fujita, Yongbin Gao

Summary: This paper proposes a novel automatic prompt generation method called F-SCP, which focuses on generating accurate prompts for low-accuracy classes and similar classes. Experimental results show that our approach outperforms state-of-the-art methods on six multi-domain datasets.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Residual Deformable Convolution for better image de-weathering

Huikai Liu, Ao Zhang, Wenqian Zhu, Bin Fu, Bingjian Ding, Shengwu Xiong

Summary: Adverse weather conditions present challenges for computer vision tasks, and image de-weathering is an important component of image restoration. This paper proposes a multi-patch skip-forward structure and a Residual Deformable Convolutional module to improve feature extraction and pixel-wise reconstruction.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

A linear transportation LP distance for pattern recognition

Oliver M. Crook, Mihai Cucuringu, Tim Hurst, Carola-Bibiane Schonlieb, Matthew Thorpe, Konstantinos C. Zygalakis

Summary: The transportation LP distance (TLP) is a generalization of the Wasserstein WP distance that can be applied directly to color or multi-channelled images, as well as multivariate time-series. TLP interprets signals as functions, while WP interprets signals as measures. Although both distances are powerful tools in modeling data with spatial or temporal perturbations, their computational cost can be prohibitively high for moderate pattern recognition tasks. The linear Wasserstein distance offers a method for projecting signals into a Euclidean space, and in this study, we propose linear versions of the TLP distance (LTLP) that show significant improvement over the linear WP distance in signal processing tasks while being several orders of magnitude faster to compute than the TLP distance.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Learning a target-dependent classifier for cross-domain semantic segmentation: Fine-tuning versus meta-learning

Haitao Tian, Shiru Qu, Pierre Payeur

Summary: This paper proposes a method of target-dependent classifier, which optimizes the joint hypothesis of domain adaptation into a target-dependent hypothesis that better fits with the target domain clusters through an unsupervised fine-tuning strategy and the concept of meta-learning. Experimental results demonstrate that this method outperforms existing techniques in synthetic-to-real adaptation and cross-city adaptation benchmarks.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

KGSR: A kernel guided network for real-world blind super-resolution

Qingsen Yan, Axi Niu, Chaoqun Wang, Wei Dong, Marcin Wozniak, Yanning Zhang

Summary: Deep learning-based methods have achieved remarkable results in the field of super-resolution. The limited availability of paired training image sets has led researchers to explore self-supervised learning, but inaccurate assumptions about the downscaling kernel often lead to degraded results. To address this issue, this paper introduces KGSR, a kernel-guided network that trains both upscaling and downscaling networks to generate high-quality high-resolution images even without knowing the actual downscaling process.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Gait feature learning via spatio-temporal two-branch networks

Yifan Chen, Xuelong Li

Summary: Gait recognition is a popular technology for identification due to its ability to capture gait features over long distances without cooperation. However, current methods face challenges as they use a single network to extract both temporal and spatial features. To solve this problem, we propose a two-branch network that focuses on spatial and temporal feature extraction separately. By combining these features, we can effectively learn the spatio-temporal information of gait sequences.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

PAMI: Partition Input and Aggregate Outputs for Model Interpretation

Wei Shi, Wentao Zhang, Wei-shi Zheng, Ruixuan Wang

Summary: This article proposes a simple yet effective visualization framework called PAMI, which does not require detailed model structure and parameters to obtain visualization results. It can be applied to various prediction tasks with different model backbones and input formats.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Disturbance rejection with compensation on features

Xiaobo Hu, Jianbo Su, Jun Zhang

Summary: This paper reviews the latest technologies in pattern recognition, highlighting their instabilities and failures in practical applications. From a control perspective, the significance of disturbance rejection in pattern recognition is discussed, and the existing problems are summarized. Finally, potential solutions related to the application of compensation on features are discussed to emphasize future research directions.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

ECLAD: Extracting Concepts with Local Aggregated Descriptors

Andres Felipe Posada-Moreno, Nikita Surya, Sebastian Trimpe

Summary: Convolutional neural networks are widely used in critical systems, and explainable artificial intelligence has proposed methods for generating high-level explanations. However, these methods lack the ability to determine the location of concepts. To address this, we propose a novel method for automatic concept extraction and localization based on pixel-wise aggregations, and validate it using synthetic datasets.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

Dynamic Graph Contrastive Learning via Maximize Temporal Consistency

Peng Bao, Jianian Li, Rong Yan, Zhongyi Liu

Summary: In this paper, a novel Dynamic Graph Contrastive Learning framework, DyGCL, is proposed to capture the temporal consistency in dynamic graphs and achieve good performance in node representation learning.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

ConvGeN: A convex space learning approach for deep-generative oversampling and imbalanced classification of small tabular datasets

Kristian Schultz, Saptarshi Bej, Waldemar Hahn, Markus Wolfien, Prashant Srivastava, Olaf Wolkenhauer

Summary: Research indicates that deep generative models perform poorly compared to linear interpolation-based methods for synthetic data generation on small, imbalanced tabular datasets. To address this, a new approach called ConvGeN, combining convex space learning with deep generative models, has been proposed. ConvGeN improves imbalanced classification on small datasets while remaining competitive with existing linear interpolation methods.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

H-CapsNet: A capsule network for hierarchical image classification

Khondaker Tasrif Noor, Antonio Robles-Kelly

Summary: In this paper, the authors propose H-CapsNet, a capsule network designed for hierarchical image classification. The network effectively captures hierarchical relationships using dedicated capsules for each class hierarchy. A modified hinge loss is utilized to enforce consistency among the involved hierarchies. Additionally, a strategy for dynamically adjusting training parameters is presented to achieve better balance between the class hierarchies. Experimental results demonstrate that H-CapsNet outperforms competing hierarchical classification networks.

PATTERN RECOGNITION (2024)

Article Computer Science, Artificial Intelligence

CS-net: Conv-simpleformer network for agricultural image segmentation

Lei Liu, Guorun Li, Yuefeng Du, Xiaoyu Li, Xiuheng Wu, Zhi Qiao, Tianyi Wang

Summary: This study proposes a new agricultural image segmentation model called CS-Net, which uses Simple-Attention Block and Simpleformer to improve accuracy and inference speed, and addresses the issue of performance collapse of Transformers in agricultural image processing.

PATTERN RECOGNITION (2024)