Journal
APPLIED INTELLIGENCE
Volume 52, Issue 6, Pages 6208-6226
Publisher: SPRINGER
DOI: 10.1007/s10489-021-02713-8
Keywords
Salient object detection; Convolutional neural networks; Contextual feature guidance; Residual attention mechanism
Funding
- National Natural Science Foundation of China [62002100, 61802111]
- Science and Technology Foundation of Henan Province of China [212102210156]
Abstract
High-level semantic features and low-level detail features both matter for salient object detection with fully convolutional neural networks (FCNs): further integrating low-level and high-level features strengthens the mapping of salient object features, and different channels of the same feature map are not equally important for saliency detection. In this paper, we propose a residual attention learning strategy and a multistage refinement mechanism that gradually refine the coarse prediction in a scale-by-scale manner. First, a global information complementary (GIC) module is designed to integrate low-level detail features with high-level semantic features. Second, a multiscale parallel convolutional (MPC) module extracts multiscale features within the same layer. Next, a residual attention module (RAM) receives the feature maps of adjacent stages from the hybrid feature cascaded aggregation (HFCA) module; the HFCA enhances the feature maps, reducing the loss of spatial detail and the impact of variations in object shape, scale, and position. Finally, we adopt a multiscale cross-entropy loss to guide the network in learning salient features. Experimental results on six benchmark datasets demonstrate that the proposed method significantly outperforms 15 state-of-the-art methods under various evaluation metrics.
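The abstract describes the residual attention module (RAM) only at a high level. As a rough, hypothetical illustration of the general idea of residual channel attention (not the authors' exact design, whose details are not given here), the following NumPy sketch gates the channels of a feature map with learned per-channel weights and adds the identity path back, so attention refines rather than replaces the input features:

```python
import numpy as np

def residual_channel_attention(x, w1, w2):
    """Hypothetical residual channel-attention block (squeeze-and-excitation
    style): global average pooling -> bottleneck FC layers -> sigmoid gate
    -> residual add. Shapes: x (C, H, W), w1 (C//r, C), w2 (C, C//r)."""
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    s = x.mean(axis=(1, 2))
    # Excitation: bottleneck with ReLU, then sigmoid per-channel weights
    z = np.maximum(w1 @ s, 0.0)
    a = 1.0 / (1.0 + np.exp(-(w2 @ z)))        # each weight lies in (0, 1)
    # Reweight channels and add the identity (residual) path
    return x + x * a[:, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))             # toy feature map, C=8
w1 = rng.standard_normal((2, 8)) * 0.1         # reduction ratio r=4
w2 = rng.standard_normal((8, 2)) * 0.1
y = residual_channel_attention(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because of the residual connection, channels the gate suppresses are attenuated rather than zeroed out, which is the usual motivation for combining attention with identity shortcuts.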