Review
Computer Science, Information Systems
Yuzhu Ji, Haijun Zhang, Zhao Zhang, Ming Liu
Summary: This paper investigates convolutional neural network-based encoder-decoder models for salient object detection. Through extensive experiments on encoder-decoder models with different parameters, the authors identify new baseline models that outperform state-of-the-art methods.
INFORMATION SCIENCES
(2021)
Article
Computer Science, Software Engineering
Dibyendu Kumar Das, Sahadeb Shit, Dip Narayan Ray, Somajyoti Majumder
Summary: This study proposes a two-way learning network in which a Closure-guided Attention Network (CGAN) and a Coarse Saliency Network (CSN) jointly supervise the feature channels to mitigate simplicity bias. A channel-wise attention residual network is incorporated in the closure-guided module to alleviate the scale-space problem and generate smooth object contours. The closure map from the CGAN is fused with the coarse saliency map from the CSN to generate the salient object, and experiments on five benchmark datasets show significant improvements over state-of-the-art methods.
Article
Engineering, Electrical & Electronic
Liqian Zhang, Qing Zhang, Rui Zhao
Summary: In this paper, a novel progressive dual-attention residual network (PDRNet) is proposed to refine predictions in a coarse-to-fine manner by exploiting two complementary attention maps. It also utilizes a hierarchical feature screening module (HFSM) to enhance global contextual knowledge for salient object detection. Experimental results show that the proposed PDRNet outperforms 18 state-of-the-art methods on benchmark datasets.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Computer Science, Artificial Intelligence
Ming-Ming Cheng, Shang-Hua Gao, Ali Borji, Yong-Qiang Tan, Zheng Lin, Meng Wang
Summary: This study proposes a lightweight salient object detection (SOD) model and investigates the semantics of SOD models. By reducing representation redundancy and using a dynamic weight decay scheme, the model achieves performance comparable to the state of the art with significantly fewer parameters. The study shows that SOD and classification methods use different mechanisms, that SOD models are category-insensitive, and that SOD training does not require ImageNet pre-training.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Zhengzheng Tu, Zhun Li, Chenglong Li, Jin Tang
Summary: In this study, we propose a novel deep correlation network for RGBT Salient Object Detection (SOD). The network explores the correlations between RGB and thermal modalities, and incorporates a modality alignment module and a bi-directional decoder model to handle unaligned image pairs and enhance feature representation. Experimental results show that our method outperforms state-of-the-art methods on three benchmark datasets.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2022)
Article
Computer Science, Information Systems
Jing-Ming Guo, Herleeyandi Markoni
Summary: This study proposes a novel approach that utilizes the original image gradient as a guide to detect and refine saliency, aiming to reduce computational cost and improve the stability and accuracy of salient object detection results.
Article
Computer Science, Artificial Intelligence
Jia Li, Shengye Qiao, Zhirui Zhao, Chenxi Xie, Xiaowu Chen, Changqun Xia
Summary: This article introduces a lightweight framework for salient object detection that addresses the dilution of semantic context, the loss of spatial structure, and the absence of boundary detail by decoupling the U-shape structure into three branches. The proposed Scale-Adaptive Pooling Module is used to obtain multi-scale receptive fields. Experimental results demonstrate that the method achieves a better balance between efficiency and accuracy.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2023)
Article
Automation & Control Systems
Jie Wang, Kechen Song, Yanqi Bao, Yunhui Yan, Yahong Han
Summary: This paper introduces a unidirectional RGB-T salient object detection network with intertwined driving of encoding and fusion. By using a transformer as the network backbone, it addresses the difficulty CNNs have in establishing long-range dependencies. Furthermore, by constructing a unidirectional architecture and using local detail-driven modules, it mitigates the drawbacks of the encoder-decoder architecture and enhances the performance of the network.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Engineering, Electrical & Electronic
Guibiao Liao, Wei Gao, Ge Li, Junle Wang, Sam Kwong
Summary: This article proposes a novel CCFENet model for RGB-T salient object detection. The model addresses the issue of defective modalities using a cross-collaboration enhancement strategy (CCE) and aggregates multi-level complementary multi-modal features using a cross-scale cross-modal decoder (CCD). Experimental results demonstrate that CCFENet outperforms existing models on multiple datasets.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2022)
Article
Automation & Control Systems
Shuo Li, Fang Liu, Licheng Jiao, Xu Liu, Puhua Chen
Summary: This paper introduces an unsupervised salient object detection method that learns salient features from the data itself. The method enhances salient features, suppresses non-salient features, and roughly locates the salient features to obtain a salient activation map. A saliency map update strategy is then used to remove noise and strengthen boundaries. The results show that the proposed method can effectively learn salient visual objects.
IEEE TRANSACTIONS ON CYBERNETICS
(2023)
Article
Computer Science, Artificial Intelligence
Xian Fang, Jinchao Zhu, Xiuli Shao, Hongpeng Wang
Summary: In this paper, we propose a novel network model LC(3)Net, equipped with the components of FCB, DCM, and BCD, to address the issues in utilizing contextual information. Extensive experiments demonstrate the superior performance of our method compared to 20 state-of-the-art methods.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Engineering, Electrical & Electronic
Fan Sun, Wujie Zhou, Lv Ye, Lu Yu
Summary: This paper introduces a hierarchical decoding network based on Swin Transformer for red-green-blue and thermal (RGB-T) salient object detection. Compared to conventional deep convolutional neural networks, this network can better capture global information of an image and is more effective in capturing semantic associations over longer ranges.
IEEE SIGNAL PROCESSING LETTERS
(2022)
Article
Computer Science, Artificial Intelligence
Chang Xu, Qingwu Li, Qingkai Zhou, Xiongbiao Jiang, Dabing Yu, Yaqin Zhou
Summary: RGB-thermal salient object detection has unique advantages in handling challenging scenes, but existing methods often overlook the differences between imaging mechanisms and thermal image characteristics, resulting in unsatisfactory performance. To address this, an asymmetric cross-modal activation network is proposed to achieve more effective RGB-T SOD by exploiting the interactions of modality-specific features.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Environmental Sciences
Zhihong Zeng, Haijun Liu, Fenglei Chen, Xiaoheng Tan
Summary: In this paper, a compensated attention feature fusion and hierarchical multiplication decoder network (CAF-HMNet) is proposed for RGB-D salient object detection. Fusing multi-modal features and refining them in a top-down manner improves detection accuracy, and an object contour-aware module is applied to enhance object contours.
Article
Engineering, Electrical & Electronic
Qing Zhang, Rui Zhao, Liqian Zhang
Summary: Driven by the rapid development of deep learning, salient object detection has made significant progress, but constructing a saliency detection network that effectively highlights salient objects and suppresses background noise remains challenging. In this paper, a novel trifurcated cascaded refinement network (TCRNet) is proposed to explore multi-level feature fusion and global information representation. The proposed network performs favorably against 20 state-of-the-art salient object detection methods on five benchmark datasets, demonstrating its effectiveness.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2023)
Article
Automation & Control Systems
Han Wang, Kechen Song, Liming Huang, Hongwei Wen, Yunhui Yan
Summary: RGB-T salient object detection has achieved rapid development and excellent results in recent years. However, the current RGB-T datasets lack low-illumination data, leading to poor performance in detecting salient objects in extremely low-illumination scenes. To address this issue, we propose a T-aware guided early fusion network that leverages thermal images to enhance the detection performance of low-illumination data.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Review
Computer Science, Artificial Intelligence
Hongkun Tian, Kechen Song, Song Li, Shuai Ma, Jing Xu, Yunhui Yan
Summary: This paper presents a comprehensive survey of data-driven robotic visual grasping detection (DRVGD) for unknown objects. It reviews both object-oriented and scene-oriented aspects, providing detailed information about associated grasping representations and datasets. The challenges of DRVGD and future directions are also pointed out.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Engineering, Multidisciplinary
Wenli Zhao, Kechen Song, Yanyan Wang, Shubo Liang, Yunhui Yan
Summary: This paper proposes a feature-aware network (FaNet) for few-shot defect classification, which can effectively distinguish new classes with a small number of labeled samples. In FaNet, ResNet12 is used as the baseline, and the feature-attention convolution module (FAC) is applied to extract comprehensive feature information from the base classes. An online feature-enhance integration module (FEI) is adopted during the test phase to average the noise from defect images, further enhancing image features among different tasks. In addition, a large-scale few-shot classification dataset of strip steel surface defects (FSC-20) with 20 different types is constructed. Experimental results show that the proposed method achieves the best performance compared to state-of-the-art methods on 5-way 1-shot and 5-way 5-shot tasks. The dataset and code are available at: https://github.com/VDT-2048/FSC-20.
Article
Chemistry, Multidisciplinary
Xiangrong Li, Zeqing Cheng, Ruonan Xu, Ziyang Wang, Li Shi, Yunhui Yan
Summary: Spherical silver nanoparticles (AgNPs) with a mean diameter of 50.4 nm were prepared using sodium citrate reduction. The interaction mechanism between AgNPs and gamma-globulin, fibrinogen, and hyaluronidase (HAase) was investigated. The results showed that AgNPs effectively quenched the intrinsic fluorescence of gamma-globulin, fibrinogen, and HAase through a static quenching mechanism. The binding constant and Hill coefficient indicated the order of interaction strength to be fibrinogen-AgNPs > gamma-globulin-AgNPs > HAase-AgNPs. The interaction between gamma-globulin/fibrinogen and AgNPs was driven by enthalpy and hydrophobic interaction, while the interaction between HAase and AgNPs was driven by entropy and van der Waals force and hydrogen bonding.
NEW JOURNAL OF CHEMISTRY
(2023)
Article
Computer Science, Artificial Intelligence
Kuan Wang, Jing Xu, Kechen Song, Yunhui Yan, Yihang Peng
Summary: This paper proposes Informed Anytime Bi-directional Fast Marching Tree (IABFMT*), an anytime asymptotically-optimal sampling-based algorithm that combines the strengths of BFMT* and IAFMT*. It performs a bi-directional lazy search to efficiently find a feasible solution and improve it quickly. Additionally, graph pruning and heuristic cost evaluation techniques are implemented to reduce unnecessary computations and improve convergence rate. Simulation results in OMPL demonstrate the superior efficiency of IABFMT* compared to other state-of-the-art algorithms in complex cluttered environments.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Automation & Control Systems
Kechen Song, Jie Wang, Yanqi Bao, Liming Huang, Yunhui Yan
Summary: Visual perception is crucial in the industrial information field, particularly in robotic grasping applications. Salient object detection (SOD) is employed to achieve fast and accurate object detection for grasping. However, existing SOD methods still have limitations in practical applications due to complex interference. To address this, a novel triple-modal image fusion strategy called visible-depth-thermal (VDT) SOD is proposed. Experimental results demonstrate that the method outperforms state-of-the-art approaches.
IEEE-ASME TRANSACTIONS ON MECHATRONICS
(2023)
Article
Computer Science, Information Systems
Liangliang Feng, Kechen Song, Junyi Wang, Yunhui Yan
Summary: Siamese tracking is a promising object tracking method that aims to improve robustness by introducing infrared data as an aid. However, current RGBT trackers have limitations in terms of operational efficiency. In this paper, an end-to-end Siamese RGBT tracking framework is proposed, which utilizes cross-modal feature enhancement and self-attention to effectively exploit the potential of Siamese tracking. The proposed framework achieved state-of-the-art performance on benchmark datasets while running in real-time.
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION
(2023)
Article
Computer Science, Information Systems
Xuefang Nie, Yunhui Yan, Tianqing Zhou, Xingbang Chen, Dingding Zhang
Summary: Cloudlet-based vehicular networks are proposed to enhance computation services by using a distributed computation method. A parallel task scheduling strategy based on a multi-agent deep reinforcement learning (DRL) approach is presented to further improve computing efficiency and reduce task processing delay. Experimental results demonstrate that the proposed DRL-based scheduling algorithm achieves a significant performance improvement over traditional task scheduling algorithms.
Article
Instruments & Instrumentation
Shubo Liang, Kechen Song, Wenli Zhao, Song Li, Yunhui Yan
Summary: The infrared image super-resolution (SR) method improves the quality and efficiency of infrared cameras by reconstructing higher-resolution images. Existing methods overlook the specificity of infrared images and focus on small-scale factors. To address this, a novel infrared SR model called DASR is proposed, which incorporates a Transformer with spatial and channel dual-attention mechanisms to capture global edge structure information. Experimental results demonstrate that DASR outperforms state-of-the-art methods in terms of both visual quality and computational efficiency.
INFRARED PHYSICS & TECHNOLOGY
(2023)
Article
Engineering, Electrical & Electronic
Wenqi Cui, Kechen Song, Hu Feng, Xiujian Jia, Shaoning Liu, Yunhui Yan
Summary: Researchers propose a novel autocorrelation-aware aggregation network (A3Net) for salient object detection of strip steel surface defects. The use of an attention mechanism and a scale interaction module contributes to the superior performance of the proposed method on both public and newly built datasets.
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
(2023)
Article
Chemistry, Analytical
Kechen Song, Yiming Zhang, Yanqi Bao, Ying Zhao, Yunhui Yan
Summary: Image segmentation is an important computer vision technique that has been widely used in various tasks. In extreme cases with insufficient illumination, model performance can be greatly affected, which has led to the use of multi-modal images in fully supervised methods. Obtaining large, densely annotated datasets is difficult, but satisfactory results can still be achieved with few-shot methods and a few pixel-annotated samples. Therefore, a Visible-Depth-Thermal (three-modal) image few-shot semantic segmentation method is proposed in this study to improve the performance of few-shot segmentation tasks by utilizing the homogeneous and complementary information of three-modal images.
Article
Engineering, Electrical & Electronic
Hu Feng, Kechen Song, Wenqi Cui, Yiming Zhang, Yunhui Yan
Summary: This article proposes a simple and effective few-shot segmentation method called CPANet, which aims to learn a network that can segment untrained S3D categories with only a few labeled defective samples. CPANet effectively aggregates long-range relationships of discrete defects using CPP and SA modules. It also introduces an SSA module to aggregate multiscale context information of defect features and suppresses interference from background information. Extensive experiments demonstrate that CPANet achieves state-of-the-art performance on the FSSD-12 dataset.
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
(2023)
Article
Computer Science, Artificial Intelligence
Shuai Ma, Kechen Song, Menghui Niu, Hongkun Tian, Yanyan Wang, Yunhui Yan
Summary: This paper proposes a feature-based domain disentanglement and randomization (FDDR) framework to improve the generalization of deep models to unseen datasets. The framework addresses the appearance difference between training and test images by decomposing the defect image into domain-invariant structural features and domain-specific style features. It also uses randomly generated samples during training to further expand the training set.
ADVANCED ENGINEERING INFORMATICS
(2024)
Article
Engineering, Multidisciplinary
Tonglei Cao, Kechen Song, Likun Xu, Hu Feng, Yunhui Yan, Jingbo Guo
Summary: This study constructs a high-resolution dataset for surface defects in ceramic tiles and addresses the scale and quantity differences in defect distribution. An improved approach is proposed by introducing a content-aware feature recombination method and a dynamic attention mechanism. Experimental results demonstrate the superior accuracy and efficiency of the proposed method.