Article

Harmony Potentials

Journal

INTERNATIONAL JOURNAL OF COMPUTER VISION
Volume 96, Issue 1, Pages 83-102

Publisher

SPRINGER
DOI: 10.1007/s11263-011-0449-8

Keywords

Semantic object segmentation; Hierarchical conditional random fields

Funding

  1. EU [ERGTS-VICI-224737, VIDI-VIDEO IST-045547, FP7-ICT-24314, FP7-ICT-248873]
  2. Spanish Research Program Consolider-Ingenio: MIPRCV [CSD2007-00018]
  3. Spanish projects [TIN2009-14501-C02-02, TIN2009-14173, TRA2010-21371-C03-01]
  4. Ramon y Cajal fellowship
  5. FPU [AP2008-03378]

The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model, since multiple classes can reasonably be expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders the underlying optimization problem tractable. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21.
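
The contrast between a single-label consistency term and a set-valued one can be illustrated with a small sketch. This is not the paper's formulation: it assumes a simple indicator-style penalty in which a local label is penalized only when it is missing from the label set assigned to the enclosing region, and the names `potts_penalty`, `harmony_penalty`, `label_subsets`, `consistency_energy`, and the weight `GAMMA` are hypothetical. The part of the model that scores how likely a given combination of classes is has been left out.

```python
from itertools import combinations

GAMMA = 1.0  # hypothetical penalty weight, chosen only for illustration


def potts_penalty(local_label, region_label, gamma=GAMMA):
    """Single-label consistency: penalize a local label that differs from the region's one label."""
    return 0.0 if local_label == region_label else gamma


def harmony_penalty(local_label, region_labels, gamma=GAMMA):
    """Set-valued consistency (assumed form): penalize a local label only if the region's set lacks it."""
    return 0.0 if local_label in region_labels else gamma


def label_subsets(classes):
    """Enumerate every subset of `classes`; there are 2 ** len(classes) of them."""
    classes = sorted(classes)
    return [frozenset(c) for r in range(len(classes) + 1) for c in combinations(classes, r)]


def consistency_energy(local_labels, region_labels, gamma=GAMMA):
    """Sum the set-valued penalty over all local labels inside one region."""
    return sum(harmony_penalty(x, region_labels, gamma) for x in local_labels)


# A region that genuinely contains two classes.
local_labels = ["cow", "grass", "cow"]

# Forcing one label on the region penalizes the "grass" node.
single = sum(potts_penalty(x, "cow") for x in local_labels)

# Allowing a set of labels leaves the same labeling unpenalized.
multi = consistency_energy(local_labels, frozenset({"cow", "grass"}))

# The expanded label set is the power set of the classes.
n_subsets = len(label_subsets(["cow", "grass", "sky"]))
```

Enumerating subsets this way grows as 2^n in the number of classes (over two million subsets for the 21 classes of MSRC-21), which is consistent with the abstract's point that a sampling strategy over the expanded label set is needed to keep optimization tractable.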

Recommended

Article Automation & Control Systems

Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification

Pau Rodriguez, Guillem Cucurull, Jordi Gonzalez, Josep M. Gonfaus, Kamal Nasrollahi, Thomas B. Moeslund, F. Xavier Roca

Summary: This paper proposes an automatic system for pain assessment, which outperforms the latest techniques by feeding the raw frames to deep learning models and considering the temporal relation and whole image. The research achieves competitive results in the UNBC-McMaster Shoulder Pain Expression Archive Database and the Cohn Kanade+ facial expression database.

IEEE TRANSACTIONS ON CYBERNETICS (2022)

Article Computer Science, Artificial Intelligence

End-to-end global to local convolutional neural network learning for hand pose recovery in depth data

Meysam Madadi, Sergio Escalera, Xavier Baro, Jordi Gonzalez

Summary: This study introduces a novel hierarchical tree-like structured CNN to address the 3D pose estimation of human hands, training branches to specialize in local poses and fusing features to learn higher order dependencies among joints. Furthermore, a non-rigid data augmentation approach is employed to increase training depth data. Experimental results show competitive performance on various datasets.

IET COMPUTER VISION (2022)

Article Computer Science, Artificial Intelligence

Image rain removal and illumination enhancement done in one go

Yecong Wan, Yuanshuo Cheng, Mingwen Shao, Jordi Gonzalez

Summary: In this paper, a novel spatially-adaptive network SANet is proposed for simultaneous rain removal and illumination enhancement. A contrastive loss and a new synthetic dataset DarkRain are introduced to boost the development of rain image restoration algorithms.

KNOWLEDGE-BASED SYSTEMS (2022)

Article Computer Science, Information Systems

Main product detection with graph networks for fashion

Vacit Oguz Yazici, Longlong Yu, Arnau Ramisa, Luis Herranz, Joost van de Weijer

Summary: This paper addresses main product detection in online fashion retail, proposing a model that utilizes Graph Convolutional Networks (GCN) to detect fashion products in bounding boxes. Compared to the state-of-the-art approach, this method performs better in scenarios where the title input is missing and during cross-dataset evaluation.

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Article Computer Science, Artificial Intelligence

Single image super-resolution based on directional variance attention network

Parichehr Behjati, Pau Rodriguez, Carles Fernandez, Isabelle Hupont, Armin Mehri, Jordi Gonzalez

Summary: This study proposes a computationally efficient and accurate single image super-resolution network called DiVANet. By introducing a directional variance attention mechanism and a residual attention feature group, the network is able to improve the performance and efficiency of image recovery.

PATTERN RECOGNITION (2023)

Article Computer Science, Artificial Intelligence

Self-Training for Class-Incremental Semantic Segmentation

Lu Yu, Xialei Liu, Joost van de Weijer

Summary: This paper addresses the problem of catastrophic forgetting in deep neural networks during incremental learning in class-incremental semantic segmentation. A self-training approach is proposed, leveraging unlabeled data for rehearsal of previous knowledge. Experimental results show that maximizing self-entropy and using diverse auxiliary data can significantly improve performance. State-of-the-art results are achieved on Pascal-VOC 2012 and ADE20K datasets.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Information Systems

A Spatio-Temporal Spotting Network with Sliding Windows for Micro-Expression Detection

Wenwen Fu, Zhihong An, Wendong Huang, Haoran Sun, Wenjuan Gong, Jordi Gonzalez

Summary: This study investigates the problem of micro-expression spotting as a frame-by-frame micro-expression classification problem and proposes an effective spotting model. The experimental results demonstrate that the proposed method outperforms the state-of-the-art method in terms of overall F-scores on the CAS(ME)2 and SAMM Long Videos databases.

ELECTRONICS (2023)

Article Computer Science, Artificial Intelligence

Class-Incremental Learning: Survey and Performance Evaluation on Image Classification

Marc Masana, Xialei Liu, Bartlomiej Twardowski, Mikel Menta, Andrew D. Bagdanov, Joost van de Weijer

Summary: For future learning systems, incremental learning is desirable due to its efficient resource usage, reduced memory usage, and resemblance to human learning. The main challenge for incremental learning is catastrophic forgetting. This paper provides a comprehensive survey of existing class-incremental learning methods for image classification and performs extensive experimental evaluations on thirteen methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Proceedings Paper Computer Science, Artificial Intelligence

3D-Aware Multi-Class Image-to-Image Translation with NeRFs

Senmao Li, Joost van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang

Summary: Recent advances in 3D-aware generative models combined with Neural Radiance Fields have achieved impressive results in 3D consistent multi-class image-to-image translation. To address the unrealistic shape/identity change in 2D-I2I translation, the learning process is divided into a multi-class 3D-aware GAN step and a 3D-aware I2I translation step, with novel techniques proposed to reduce view-consistency problems.

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2023)

Proceedings Paper Computer Science, Artificial Intelligence

MVMO: A MULTI-OBJECT DATASET FOR WIDE BASELINE MULTI-VIEW SEMANTIC SEGMENTATION

Aitor Alvarez-Gila, Joost van de Weijer, Yaxing Wang, Estibaliz Garrote

Summary: MVMO is a synthetic dataset with high object density and wide camera baselines, enabling research in multi-view semantic segmentation and cross-view semantic transfer. New research is needed to utilize the information from multi-view setups effectively.

2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Visual Transformers with Primal Object Queries for Multi-Label Image Classification

Vacit Oguz Yazici, Joost Van De Weijer, Longlong Yu

Summary: This paper investigates the problem of multi-label image classification and proposes an enhanced transformer model that utilizes primal object queries to improve model performance and convergence speed.

2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) (2022)

Proceedings Paper Computer Science, Theory & Methods

Transferring Unconditional to Conditional GANs with Hyper-Modulation

Hector Laria, Yaxing Wang, Joost van de Weijer, Bogdan Raducanu

Summary: GANs have matured in recent years and can generate high-resolution, realistic images. This paper focuses on transferring from high-quality pretrained unconditional GANs to conditional GANs, proposing hyper-modulated generative networks for architectural adaptation and introducing self-initialization and contrastive loss for improved transfer efficiency.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 (2022)

Proceedings Paper Computer Science, Theory & Methods

Area Under the ROC Curve Maximization for Metric Learning

Bojana Gajic, Ariel Amato, Ramon Baldrich, Joost van de Weijer, Carlo Gatta

Summary: Most popular metric learning losses are not directly related to the evaluation metrics used to assess their performance. However, training a metric learning model by maximizing the area under the ROC curve can induce a suitable implicit ranking for retrieval problems. By proposing an approximated and derivable AUC loss, state-of-the-art performance is achieved on large scale retrieval benchmark datasets.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Class-Balanced Active Learning for Image Classification

Javad Zolfaghari Bengar, Joost van de Weijer, Laura Lopez Fuentes, Bogdan Raducanu

Summary: In real-world scenarios, imbalanced class distribution in datasets further complicates the active learning process. To address this issue, we propose an optimization framework considering class-balancing, which can effectively improve the performance of active learning methods.

2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022) (2022)

Article Computer Science, Information Systems

Frequency-Based Enhancement Network for Efficient Super-Resolution

Parichehr Behjati, Pau Rodriguez, Carles Fernandez Tena, Armin Mehri, F. Xavier Roca, Seiichi Ozawa, Jordi Gonzalez

Summary: This study focuses on single image super-resolution based on deep convolutional neural networks (CNNs), proposing a novel Frequency-based Enhancement Block (FEB) to enhance high-frequency information and recover finer details. Experimental results show that replacing commonly used SR blocks with FEB improves reconstruction error and reduces the number of parameters in the model.

IEEE ACCESS (2022)
