SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

INTERNATIONAL JOURNAL OF COMPUTER VISION (2015)

Journal

INTERNATIONAL JOURNAL OF COMPUTER VISION

Volume 115, Issue 3, Pages 330-344

Publisher

SPRINGER

DOI: 10.1007/s11263-015-0822-0

Keywords

Convolutional neural networks; Deep learning; Feature learning; Saliency detection

Categories

Computer Science, Artificial Intelligence

Funding

RGC of Hong Kong (RGC) [CityU 115112, CityU 21201914]

Abstract

Existing computational models for salient object detection primarily rely on hand-crafted features, which are only able to capture low-level contrast information. In this paper, we learn the hierarchical contrast features by formulating salient object detection as a binary labeling problem using deep learning techniques. A novel superpixelwise convolutional neural network approach, called SuperCNN, is proposed to learn the internal representations of saliency in an efficient manner. In contrast to the classical convolutional networks, SuperCNN has four main properties. First, the proposed method is able to learn the hierarchical contrast features, as it is fed by two meaningful superpixel sequences, which is much more effective for detecting salient regions than feeding raw image pixels. Second, as SuperCNN recovers the contextual information among superpixels, it enables large context to be involved in the analysis efficiently. Third, benefiting from the superpixelwise mechanism, the required number of predictions for a densely labeled map is hugely reduced. Fourth, saliency can be detected independent of region size by utilizing a multiscale network structure. Experiments show that SuperCNN can robustly detect salient objects and outperforms the state-of-the-art methods on three benchmark datasets.
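The third property in the abstract (far fewer predictions for a dense map) can be illustrated with a minimal sketch. This is not the authors' implementation: the square-grid "superpixels", the helper names, and the `placeholder_scores` function are all assumptions made for illustration, and the scorer merely stands in for a trained network. The point is only that one prediction per region, broadcast back to pixels, yields a dense map at a fraction of the per-pixel cost.

```python
import numpy as np


def grid_superpixels(height, width, cell):
    """Label each pixel with the index of its square cell.

    A crude stand-in for a real superpixel algorithm (assumption).
    """
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    cells_per_row = -(-width // cell)  # ceiling division
    return rows[:, None] * cells_per_row + cols[None, :]


def region_mean_colors(image, labels):
    """Return the mean color of every region, shape (n_regions, channels)."""
    flat = labels.ravel()
    n_regions = int(flat.max()) + 1
    counts = np.bincount(flat, minlength=n_regions)
    sums = np.stack(
        [np.bincount(flat, weights=image[..., c].ravel(), minlength=n_regions)
         for c in range(image.shape[-1])],
        axis=1,
    )
    return sums / counts[:, None]


def placeholder_scores(means):
    """Hypothetical scorer: distance of each region's color from the average.

    It only marks the place where a trained network would be called.
    """
    dist = np.linalg.norm(means - means.mean(axis=0), axis=1)
    return dist / dist.max() if dist.max() > 0 else dist


def dense_map(image, cell=8):
    """Predict once per region, then broadcast the scores back to pixels."""
    labels = grid_superpixels(image.shape[0], image.shape[1], cell)
    scores = placeholder_scores(region_mean_colors(image, labels))
    return scores[labels], scores.size


image = np.zeros((64, 96, 3))
image[16:48, 32:64] = 1.0  # a bright square on a dark background
saliency, n_predictions = dense_map(image)
print(saliency.shape, n_predictions, image.shape[0] * image.shape[1])
```

For this 64 x 96 toy image the dense map is produced from 96 region-level predictions rather than 6144 pixel-level ones. A real pipeline would replace the grid with a proper superpixel segmentation and the placeholder scorer with the learned model.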

Recommended

Article Computer Science, Artificial Intelligence

Saliency Detection With a Three-Stage Hierarchical Network

Dongjing Shan, Xiongwei Zhang, Tieyong Cao, Limin Wang, Chao Zhang

Summary: In this article, a three-stage hierarchical neural network is proposed for saliency detection, combining Fast R-CNN, a self-attention mechanism, and a global regression model. Experiments on several benchmark datasets, including comparisons with 12 previous methods, demonstrate excellent performance.

IEEE INTELLIGENT SYSTEMS (2021)

Article Computer Science, Information Systems

An end-to-end network for co-saliency detection in one single image

Yuanhao Yue, Qin Zou, Hongkai Yu, Qian Wang, Zhongyuan Wang, Song Wang

Summary: This study proposes a novel end-to-end trainable network for co-saliency detection within a single image. The network combines bottom-up and top-down strategies by using ground-truth masks as top-down guidance and constructing triplet proposals for regional feature mapping and clustering.

SCIENCE CHINA-INFORMATION SCIENCES (2023)

Article Engineering, Electrical & Electronic

Object Detection in 20 Years: A Survey

Zhengxia Zou, Keyan Chen, Zhenwei Shi, Yuhong Guo, Jieping Ye

Summary: Object detection, a fundamental problem in computer vision, has received significant attention in recent years. This article reviews the rapid technological evolution of object detection over the past two decades and its impact on the entire computer vision field. It covers various topics such as milestone detectors, datasets, metrics, fundamental building blocks, speedup techniques, and state-of-the-art methods.

PROCEEDINGS OF THE IEEE (2023)

Review Computer Science, Information Systems

Object Detection Using Deep Learning, CNNs and Vision Transformers: A Review

Ayoub Benali Amjoud, Mustapha Amrouch

Summary: This paper examines the evolution of object detection in the era of deep learning, reviews various state-of-the-art algorithms and their underlying concepts, and classifies them into anchor-based, anchor-free, and transformer-based detectors. The paper discusses the insights behind these algorithms and provides experimental analyses comparing quality metrics, speed/accuracy trade-offs, and training methodologies. Additionally, it compares major convolutional neural networks for object detection, highlights the strengths and limitations of each model, and summarizes the development of object detection methods under deep learning through simple graphical illustrations. Finally, the paper identifies future research directions.

IEEE ACCESS (2023)

Article Medicine, General & Internal

A Deep Feature Fusion of Improved Suspected Keratoconus Detection with Deep Learning

Ali H. Al-Timemy, Laith Alzubaidi, Zahraa M. Mosa, Hazem Abdelmotaal, Nebras H. Ghaeb, Alexandru Lavric, Rossen M. Hazarbassanov, Hidenori Takahashi, Yuantong Gu, Siamak Yousefi

Summary: In this study, a deep learning model is proposed to accurately and robustly detect early clinical keratoconus (KCN). By extracting features from three different corneal maps using Xception and InceptionResNetV2 deep learning architectures, and then fusing the features, subclinical forms of KCN can be detected with high accuracy. The model achieved an AUC of 0.99 and an accuracy range of 97-100% in distinguishing normal eyes from eyes with subclinical and established KCN. The model was further validated on an independent dataset with an AUC of 0.91-0.92 and an accuracy range of 88-92%. This model is a step toward improving the detection of clinical and subclinical forms of KCN.

DIAGNOSTICS (2023)

Article Computer Science, Artificial Intelligence

Activity guided multi-scales collaboration based on scaled-CNN for saliency prediction

Deqiang Cheng, Ruihang Liu, Jiahan Li, Song Liang, Qiqi Kou, Kai Zhao

Summary: This study introduces a lightweight saliency prediction model based on convolutional neural networks, utilizing multi-scale collaboration learning of global and local information, achieving competitive and consistent results on challenging benchmark datasets with better prediction performance, fewer parameters, and faster inference speed.

IMAGE AND VISION COMPUTING (2021)

Article Plant Sciences

TeaDiseaseNet: multi-scale self-attentive tea disease detection

Yange Sun, Fei Wu, Huaping Guo, Ran Li, Jianfeng Yao, Jianbo Shen

Summary: This paper introduces a novel method called TeaDiseaseNet for tea disease detection. It utilizes a multi-scale self-attention mechanism and a channel attention mechanism to achieve accurate detection and localization of tea disease information. Experimental results demonstrate its superior performance in scenarios with complex backgrounds and varying disease scales, highlighting its potential for intelligent tea disease diagnosis.

FRONTIERS IN PLANT SCIENCE (2023)

Article Computer Science, Artificial Intelligence

Depth Injection Framework for RGBD Salient Object Detection

Shunyu Yao, Miao Zhang, Yongri Piao, Chaoyi Qiu, Huchuan Lu

Summary: This paper proposes a depth injection framework to enhance the semantic representation by injecting depth maps into the encoder. A depth injection module is also introduced to complement and guide the information between depth maps and the encoder. Experimental results show that the proposed method achieves state-of-the-art performance on multiple datasets and exhibits strong generalization ability.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2023)

Article Computer Science, Information Systems

Convolution Neural Network With Coordinate Attention for the Automatic Detection of Pulmonary Tuberculosis Images on Chest X-Rays

Tianhao Xu, Zhenming Yuan

Summary: This study proposes a low-cost, automatic method for detecting pulmonary tuberculosis on chest X-rays to assist primary radiologists. By combining a coordinate attention mechanism with a convolutional neural network, the method identifies and classifies pulmonary tuberculosis images more accurately. Evaluation on a public dataset shows high accuracy and recall, which can aid radiologists in diagnosis.

IEEE ACCESS (2022)

Article Environmental Sciences

Tri-CNN: A Three Branch Model for Hyperspectral Image Classification

Mohammed Q. Q. Alkhatib, Mina Al-Saad, Nour Aburaed, Saeed Almansoori, Jaime Zabalza, Stephen Marshall, Hussain Al-Ahmad

Summary: A novel method called Tri-CNN and a three-branch feature fusion approach are proposed to address the issue of insufficient training samples in hyperspectral image (HSI) classification. Experimental results demonstrate that the proposed method exhibits remarkable performance in terms of overall accuracy (OA), average accuracy (AA), and Kappa metrics when compared to existing methods.

REMOTE SENSING (2023)

Article Nuclear Science & Technology

Automated detection of corrosion in used nuclear fuel dry storage canisters using residual neural networks

Theodore Papamarkou, Hayley Guy, Bryce Kroencke, Jordan Miller, Preston Robinette, Daniel Schultz, Jacob Hinkle, Laura Pullum, Catherine Schuman, Jeremy Renshaw, Stylianos Chatzidakis

Summary: This paper discusses the use of residual neural networks for real-time corrosion detection in nuclear fuel canisters, demonstrating the potential for automating inspections, reducing costs, and minimizing radiation exposure. The proposed approach involves cropping and training the network on images to accurately detect corroded areas and classify images with high precision.

NUCLEAR ENGINEERING AND TECHNOLOGY (2021)

Article Computer Science, Artificial Intelligence

A multi-task approach for contrastive learning of handwritten signature feature representations

Talles B. Viana, Victor L. F. Souza, Adriano L. I. Oliveira, Rafael M. O. Cruz, Robert Sabourin

Summary: Despite recent advances in computer vision, the problem of offline handwritten signature verification remains challenging. Deep learning methods have been investigated to learn feature representations of handwritten signatures. A multi-task framework based on deep contrastive learning is proposed to improve signature verification by adjusting the feature representations of genuine and skilled forgery signatures.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Computer Science, Information Systems

Convolutional neural networks based potholes detection using thermal imaging

Yukti Aparna, Yukti Bhatia, Rachna Rai, Varun Gupta, Naveen Aggarwal, Aparna Akula

Summary: Potholes on roads are a major cause of accidents and vehicle wear and tear. Current pothole detection techniques have drawbacks, so this study analyzes the feasibility and accuracy of thermal imaging for pothole detection. A deep learning approach based on convolutional neural networks is adopted, and self-built and pre-trained models are compared. Thermal imaging achieved a best accuracy of 97.08% with one of the pre-trained models. This study is important for guiding future research in the field of pothole detection.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2022)

Article Engineering, Electrical & Electronic

DeepDFML-NILM: A New CNN-Based Architecture for Detection, Feature Extraction and Multi-Label Classification in NILM Signals

Lucas da Silva Nolasco, Andre Eugenio Lazzaretti, Bruna Machado Mulinari

Summary: This paper presents an integrated method for handling high-frequency NILM signals, including detection, feature extraction, and classification. The results show that the accuracy of this method is above 90% in most cases, surpassing state-of-the-art approaches, and it also includes a multi-label procedure to increase the recognition of multiple loads.

IEEE SENSORS JOURNAL (2022)

Article Geochemistry & Geophysics

Rotation-Invariant Feature Learning via Convolutional Neural Network With Cyclic Polar Coordinates Convolutional Layer

Shaohui Mei, Ruoqiao Jiang, Mingyang Ma, Chao Song

Summary: This article proposes a novel cyclic polar coordinate convolutional layer (CPCCL) for CNNs to handle the problem of rotation invariance. The CPCCL converts rotation variation into translation variation using polar coordinates transformation, and employs cyclic convolution to handle the translation variation. Experimental results demonstrate that the proposed CPCCL can effectively handle the rotation-sensitive problem in traditional CNNs and outperforms several state-of-the-art rotation-invariant feature learning algorithms.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (2023)

Article Computer Science, Information Systems

Frequency-aware Camouflaged Object Detection

Jiaying Lin, Xin Tan, Ke Xu, Lizhuang Ma, Rynson W. H. Lau

Summary: This article proposes a frequency-based method called FBNet for camouflaged object detection. The method suppresses confusing high-frequency texture information to separate camouflaged objects from the background. It also includes frequency-aware context aggregation and adaptive frequency attention modules, as well as a gradient-weighted loss function to focus on contour details. Experimental results demonstrate that FBNet outperforms state-of-the-art methods in camouflaged object detection.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Pose- and Attribute-consistent Person Image Synthesis

Cheng Xu, Zejun Chen, Jiajie Mai, Xuemiao Xu, Shengfeng He

Summary: Person image synthesis faces two critical problems when transferring the appearance of a source person image to a target pose: synthesis distortion due to pose and appearance entanglement, and failure to preserve the original semantics. The proposed PAC-GAN explicitly tackles these problems by using a component-wise transferring model and a high-level semantic constraint. Experimental results on the DeepFashion dataset demonstrate the superiority of the method in maintaining pose and attribute consistency.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Mirror Segmentation via Semantic-aware Contextual Contrasted Feature Learning

Haiyang Mei, Letian Yu, Ke Xu, Yang Wang, Xin Yang, Xiaopeng Wei, Rynson W. H. Lau

Summary: This article introduces a method for segmenting mirrors and proposes a novel network model called MirrorNet+ to address this problem. The authors construct a large-scale mirror segmentation dataset and conduct extensive experiments to validate the effectiveness and generalization capability of the proposed method. The article also discusses applications of mirror segmentation and possible future research directions.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Hardware & Architecture

Edge Distraction-aware Salient Object Detection

Sucheng Ren, Wenxi Liu, Jianbo Jiao, Guoqiang Han, Shengfeng He

Summary: In this study, we propose a new method to generate distraction-free edge features by incorporating holistic interdependencies between high-level features. Experimental results demonstrate that our method outperforms the state-of-the-art methods on benchmark datasets, with fast inference speed on a single GPU.

IEEE MULTIMEDIA (2023)

Article Computer Science, Artificial Intelligence

Rain Removal From Light Field Images With 4D Convolution and Multi-Scale Gaussian Process

Tao Yan, Mingyue Li, Bin Li, Yang Yang, Rynson W. H. Lau

Summary: This research proposes a method for removing rain streaks from light field images by simultaneously processing all sub-views using 4D convolutional layers and detecting rain streaks with a multi-scale self-guided Gaussian process module. Through training on virtual and real-world rainy light field images, accurate detection and removal of rain streaks are achieved, leading to the restoration of rain-free light field images.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2023)

Article Computer Science, Artificial Intelligence

Reducing Spatial Labeling Redundancy for Active Semi-Supervised Crowd Counting

Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Dan Xu, Jing Qin, Shengfeng He

Summary: Labeling is challenging for crowd counting, and recent methods have proposed semi-supervised approaches to reduce labeling efforts. However, the None-or-All labeling strategy is suboptimal as it does not consider the diversity of individuals in unlabeled crowd images. In this study, we propose breaking the labeling chain and reducing spatial labeling redundancy to improve semi-supervised crowd counting. We annotate representative regions, analyze region representativeness, and directly supervise unlabeled regions using similarity among individuals. Our experiments show significant performance improvement compared to previous methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Software Engineering

Parsing-Conditioned Anime Translation: A New Dataset and Method

Zhansheng Li, Yangyang Xu, Nanxuan Zhao, Yang Zhou, Yongtuo Liu, Dahua Lin, Shengfeng He

Summary: This study proposes a new anime translation framework by utilizing the prior knowledge of a pre-trained StyleGAN model. The framework incorporates disentangled encoders to separately embed structure and appearance information and includes a FaceBank aggregation method for generating in-domain animes. A new anime portrait parsing dataset, Danbooru-Parsing, is introduced to connect face semantics with appearances, enabling a constrained translation setting. The experiments demonstrate the effectiveness and value of the new dataset and method, providing the first feasible solution for anime translation.

ACM TRANSACTIONS ON GRAPHICS (2023)

Article Computer Science, Artificial Intelligence

DSDNet: Toward single image deraining with self-paced curricular dual stimulations

Yong Du, Junjie Deng, Yulong Zheng, Junyu Dong, Shengfeng He

Summary: The crucial challenge of single image deraining is to remove rain streaks while preserving image details. This paper proposes a novel deep network called DSDNet, which estimates rain streaks and detail loss separately, and predicts a rain mask indicating the location and intensity of rain. Extensive experiments show that the proposed method outperforms state-of-the-art methods and is effective in joint tasks of single image deraining, detection, and segmentation.

COMPUTER VISION AND IMAGE UNDERSTANDING (2023)

Article Computer Science, Artificial Intelligence

Single-View View Synthesis with Self-rectified Pseudo-Stereo

Yang Zhou, Hanjie Wu, Wenxi Liu, Zheng Xiong, Jing Qin, Shengfeng He

Summary: Synthesizing novel views from a single view image is a challenging task, but can be improved by expanding to a multi-view setting. By leveraging stereo prior, a pseudo-stereo viewpoint is generated to assist in 3D reconstruction, making the view synthesis process simpler. A self-rectified stereo synthesis approach is proposed to correct erroneous regions and generate high-quality stereo images.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2023)

Article Engineering, Electrical & Electronic

Contextual-Assisted Scratched Photo Restoration

Weiwei Cai, Huaidong Zhang, Xuemiao Xu, Shengfeng He, Kun Zhang, Jing Qin

Summary: In this paper, an automatic retouching approach for scratched photographs is proposed, which utilizes scratch and background context for processing in two stages. Experimental results demonstrate that the proposed method outperforms existing methods. Additionally, two new scratched photo datasets are created to promote development in the field.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2023)

Article Computer Science, Software Engineering

Design Order Guided Visual Note Layout Optimization

Xiaotian Qiao, Ying Cao, Rynson W. H. Lau

Summary: A clear and easy-to-follow layout is important for visual notes. In this article, a novel approach is proposed to automatically optimize the layouts of visual notes by predicting the design order and warping the contents accordingly. The results show that the approach can effectively improve the layout of visual notes for better readability.

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS (2023)

Article Computer Science, Artificial Intelligence

Large-Field Contextual Feature Learning for Glass Detection

Haiyang Mei, Xin Yang, Letian Yu, Qiang Zhang, Xiaopeng Wei, Rynson W. H. Lau

Summary: This paper addresses the important problem of detecting glass surfaces from a single RGB image by proposing a novel glass detection network called GDNet-B. The network explores contextual cues and integrates boundary features to achieve satisfying detection results. The effectiveness and generalization capability of GDNet-B are further validated and its potential applications and future research directions are discussed.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Structure-Informed Shadow Removal Networks

Yuhao Liu, Qing Guo, Lan Fu, Zhanghan Ke, Ke Xu, Wei Feng, Ivor W. Tsang, Rynson W. H. Lau

Summary: In this paper, a novel structure-informed shadow removal network (StructNet) is proposed to address the problem of shadow remnants in existing deep learning-based methods. StructNet reconstructs the structure information of the input image without shadows and uses it to guide the image-level shadow removal. Two main modules, MSFE and MFRA, are developed to extract image structural features and regularize feature consistency. Additionally, an extension called MStructNet is proposed to exploit multi-level structure information and improve shadow removal performance.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation

Yongtuo Liu, Dan Xu, Sucheng Ren, Hanjie Wu, Hongmin Cai, Shengfeng He

Summary: This paper proposes a method to separate domain-invariant crowd and domain-specific background from crowd images, and designs a fine-grained domain adaptation method for crowd counting. By learning crowd segmentation and designing a crowd-aware adaptation mechanism, the method consistently outperforms previous approaches in domain adaptation scenarios.

2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME (2023)

Article Computer Science, Artificial Intelligence

TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation

Xuejian Li, Shiqiang Ma, Junhai Xu, Jijun Tang, Shengfeng He, Fei Guo

Summary: Automatic segmentation of medical images is crucial for disease diagnosis. This paper proposes a dual-path segmentation model called TranSiam for multi-modal medical images. The model utilizes parallel CNNs and a Transformer layer to extract features from different modalities, and aggregates the features using a locality-aware aggregation block.

EXPERT SYSTEMS WITH APPLICATIONS (2024)
