Article

Rotation-Invariant Image and Video Description With Local Binary Pattern Features

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 21, Issue 4, Pages 1465-1477

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2011.2175739

Keywords

Classification; dynamic texture; feature; Fourier transform; local binary patterns (LBP); rotation invariance; texture

Funding

  1. Academy of Finland
  2. Infotech Oulu
  3. Czech Science Foundation [P103/10/1585]

In this paper, we propose a novel approach to compute rotation-invariant features from histograms of local noninvariant patterns. We apply this approach to both static and dynamic local binary pattern (LBP) descriptors. For static-texture description, we present LBP histogram Fourier (LBP-HF) features, and for dynamic-texture recognition, we present two rotation-invariant descriptors computed from the LBP from three orthogonal planes (LBP-TOP) features in the spatiotemporal domain. LBP-HF is a novel rotation-invariant image descriptor computed from discrete Fourier transforms of LBP histograms. The approach can also be generalized to embed any uniform features into this framework, and combining supplementary information, e.g., the sign and magnitude components of the LBP, can improve the description ability. Moreover, two rotation-invariant variants are proposed for LBP-TOP, which is an effective descriptor for dynamic-texture recognition, as shown by its recent success in different application problems, but is not itself rotation invariant. In the experiments, LBP-HF and its extensions outperform noninvariant LBP and earlier rotation-invariant versions of LBP in rotation-invariant texture classification. In experiments on two dynamic-texture databases with rotations or view variations, the proposed video features effectively deal with rotation variations of dynamic textures (DTs). They are also robust with respect to changes in viewpoint, outperforming recent methods proposed for view-invariant recognition of DTs.
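
The idea of computing rotation-invariant features from the discrete Fourier transform of an LBP histogram can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes P = 8 neighbours taken directly from the 3x3 pixel neighbourhood (no circular interpolation), and the function names `lbp_codes`, `uniform_code`, and `lbp_hf` are hypothetical. Under these assumptions, rotating the image by 90 degrees cyclically shifts each row of the uniform-pattern histogram, so the DFT magnitudes along each row stay the same.

```python
import numpy as np

P = 8  # assumption: 8 neighbours from the 3x3 neighbourhood, no interpolation

# Neighbour offsets (dy, dx), ordered around the centre in 45-degree steps.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def lbp_codes(img):
    """Return the 8-bit LBP code of every interior pixel."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    for p, (dy, dx) in enumerate(OFFSETS):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= center).astype(np.int64) << p
    return codes


def uniform_code(n, r):
    """Code of the uniform pattern with n consecutive ones starting at bit r."""
    return sum(1 << ((r + k) % P) for k in range(n))


def lbp_hf(img):
    """Sketch of Fourier features computed from a uniform LBP histogram."""
    hist = np.bincount(lbp_codes(img).ravel(), minlength=2 ** P).astype(float)
    # Rows: number of ones n = 1..P-1. Columns: rotation r = 0..P-1.
    uni = np.array([[hist[uniform_code(n, r)] for r in range(P)]
                    for n in range(1, P)])
    # A rotation cyclically shifts each row; DFT magnitudes ignore such shifts.
    spectrum = np.abs(np.fft.fft(uni, axis=1))[:, :P // 2 + 1]
    all_zero, all_one = hist[0], hist[2 ** P - 1]
    non_uniform = hist.sum() - uni.sum() - all_zero - all_one
    return np.concatenate([spectrum.ravel(), [all_zero, all_one, non_uniform]])


rng = np.random.default_rng(0)
img = rng.random((32, 32))
f0 = lbp_hf(img)
f1 = lbp_hf(np.rot90(img))  # rotate by 90 degrees
print(f0.shape, np.allclose(f0, f1))
```

Only the magnitudes for frequencies 0 to P/2 are kept because the histogram rows are real, which makes the remaining magnitudes redundant. Rotations that are not multiples of 45 degrees would need interpolated circular sampling and are only approximately handled by this kind of feature.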

Recommended

Article Computer Science, Artificial Intelligence

Cross-Database Micro-Expression Recognition: A Benchmark

Tong Zhang, Yuan Zong, Wenming Zheng, C. L. Philip Chen, Xiaopeng Hong, Chuangao Tang, Zhen Cui, Guoying Zhao

Summary: This paper discusses the challenges and importance of cross-database micro-expression recognition (CDMER) and contributes to this field by establishing an evaluation protocol, conducting benchmark experiments, and proposing a novel DA method.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2022)

Article Computer Science, Artificial Intelligence

Analyzing Group-Level Emotion with Global Alignment Kernel based Approach

Xiaohua Huang, Abhinav Dhall, Roland Goecke, Matti Pietikainen, Guoying Zhao

Summary: This article proposes a new method to effectively analyze group behavior and emotion from a group-level image, using a combination of global alignment kernels and support vector machine. The distance between two group-level images is measured using a global alignment kernel, and a global weight sort scheme is used to optimize the performance of the kernel. Experimental results demonstrate promising performance for group-level emotion recognition.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2022)

Article Engineering, Electrical & Electronic

A Robust GAN-Generated Face Detection Method Based on Dual-Color Spaces and an Improved Xception

Beijing Chen, Xin Liu, Yuhui Zheng, Guoying Zhao, Yun-Qing Shi

Summary: This paper presents experimental findings on detecting post-processed GAN-generated face images and proposes a new method to improve detection performance and robustness.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Article Engineering, Electrical & Electronic

A Local Perturbation Generation Method for GAN-Generated Face Anti-Forensics

Haitao Zhang, Beijing Chen, Jinwei Wang, Guoying Zhao

Summary: In this paper, an effective local perturbation generation method is proposed to expose the vulnerability of state-of-the-art forensic detectors for GAN-generated faces. The method mines the common areas of concern in multiple detectors' decision-making and generates local anti-forensic perturbations using GANs to enhance the visual quality and transferability of anti-forensic faces. Experimental results demonstrate the method's advantage over the state-of-the-art methods in terms of anti-forensic success rate, imperceptibility, and transferability.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2023)

Article Computer Science, Artificial Intelligence

Hyperbolic Deep Neural Networks: A Survey

Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, Guoying Zhao

Summary: Hyperbolic deep neural networks (HDNNs) have shown superior performance and better physical interpretability in hierarchical structured data, and have been widely applied in different scientific fields. This paper provides a comprehensive review of the neural components in HDNN, demonstrating the potential of extending leading deep approaches to hyperbolic space and applications in various tasks.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Artificial Intelligence

Binaural SoundNet: Predicting Semantics, Depth and Motion With Binaural Sounds

Dengxin Dai, Arun Balajee Vasudevan, Jiri Matas, Luc Van Gool

Summary: This work develops an approach for scene understanding purely based on binaural sounds, which can predict the semantic masks, motion, and depth of sound-making objects. By leveraging cross-modal distillation and spatial sound super-resolution, the performance of auditory perception tasks is significantly improved. Experimental results show good performance in all tasks, mutual benefits between tasks, and importance of microphone quantity and orientation.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Engineering, Electrical & Electronic

Facial Micro-Expressions: An Overview

Guoying Zhao, Xiaobai Li, Yante Li, Matti Pietikainen

Summary: Micro-expression (ME) is an involuntary, fleeting, and subtle facial expression that can provide essential clues to people's true feelings. In recent years, ME analysis, especially automatic ME analysis in computer vision, has gained much attention due to its practical importance. This survey provides a comprehensive review of ME development in the field of computer vision, discussing various computational ME analysis methods and future directions.

PROCEEDINGS OF THE IEEE (2023)

Article Computer Science, Artificial Intelligence

Visual Object Tracking With Discriminative Filters and Siamese Networks: A Survey and Outlook

Sajid Javed, Martin Danelljan, Fahad Shahbaz Khan, Muhammad Haris Khan, Michael Felsberg, Jiri Matas

Summary: Accurate and robust visual object tracking is a challenging problem in computer vision. This survey reviews more than 90 Discriminative Correlation Filters (DCFs) and Siamese trackers, based on results in nine tracking benchmarks. It presents the background theory, research challenges, and performance analysis of both DCFs and Siamese trackers, and provides recommendations for future research.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Deep Learning for Face Anti-Spoofing: A Survey

Zitong Yu, Yunxiao Qin, Xiaobai Li, Chenxu Zhao, Zhen Lei, Guoying Zhao

Summary: This paper presents the first comprehensive review of recent advances in deep learning based face anti-spoofing (FAS), including pixel-wise supervision, domain generalization, and multi-modal sensors. It aims to stimulate future research in the field.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Engineering, Electrical & Electronic

End-to-End Dual-Branch Network Towards Synthetic Speech Detection

Kaijie Ma, Yifan Feng, Beijing Chen, Guoying Zhao

Summary: Synthetic speech attacks pose a great threat to ASV systems. A Dual-Branch Network is proposed, using LFCC and CQT as inputs, to enhance the generalization ability for attacks generated by unknown synthesis algorithms. The system outperforms existing state-of-the-art systems and shows good generalization for unknown forgery types.

IEEE SIGNAL PROCESSING LETTERS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

FEAR: Fast, Efficient, Accurate and Robust Visual Tracker

Vasyl Borsuk, Roman Vei, Orest Kupyn, Tetiana Martyniuk, Igor Krashenyi, Jiri Matas

Summary: We introduce FEAR, a family of efficient Siamese visual trackers that achieve high accuracy and robustness. By incorporating dual-template representation and pixel-wise fusion block, FEAR trackers outperform most Siamese trackers in terms of accuracy and efficiency. The optimized version, FEAR-XS, offers significantly faster tracking while maintaining near state-of-the-art results.

COMPUTER VISION, ECCV 2022, PT XXII (2022)

Proceedings Paper Computer Science, Artificial Intelligence

DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image

Tetiana Martyniuk, Orest Kupyn, Yana Kurlyak, Igor Krashenyi, Jiri Matas, Viktoriia Sharmanska

Summary: This paper presents a dense and diverse large-scale dataset, DAD-3DHeads, as well as a robust model for 3D Dense Head Alignment in-the-wild. The dataset contains annotations of over 3.5K landmarks that accurately represent 3D head shape. The data-driven model, DAD-3DNet, learns shape, expression, and pose parameters, and performs 3D reconstruction of a FLAME mesh.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Point Cloud Color Constancy

Xiaoyan Xing, Yanlin Qian, Sibo Feng, Yuhan Dong, Jiri Matas

Summary: In this paper, we propose PCCC, an algorithm for illumination chromaticity estimation using point clouds. By leveraging depth information from a ToF sensor and RGB intensities, PCCC applies the PointNet architecture to derive the illumination vector and make a global decision about the global illumination chromaticity. PCCC outperforms state-of-the-art algorithms on popular RGB-D datasets and a novel benchmark, with a simple and fast method that requires a small input size.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Recall@k Surrogate Loss with Large Batches and Similarity Mixup

Yash Patel, Giorgos Tolias, Jiri Matas

Summary: This work focuses on learning deep visual representation models for retrieval by exploring the interplay between a new loss function, the batch size, and a new regularization approach. The suggested method achieves state-of-the-art performance in several image retrieval benchmarks when used for deep metric learning.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2022)

Article Computer Science, Artificial Intelligence

Leaders and Followers Identified by Emotional Mimicry During Collaborative Learning: A Facial Expression Recognition Study on Emotional Valence

Muhterem Dindar, Sanna Jarvela, Sara Ahola, Xiaohua Huang, Guoying Zhao

Summary: This article explores the potential of emotional mimicry in identifying leader and follower students in collaborative learning settings. The findings suggest that video-based facial emotion recognition combined with cross-recurrence quantification analysis can accurately identify leaders and followers. This research highlights the importance of using these methods in collaborative learning research and their ability to explain social and affective dynamics within the setting.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2022)
