Article

Deep spatial-temporal feature fusion for facial expression recognition in static images

Journal

PATTERN RECOGNITION LETTERS
Volume 119, Issue -, Pages 49-61

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2017.10.022

Keywords

Facial expression recognition; Deep neural network; Optical flow; Spatial-temporal feature fusion; Transfer learning

Funding

  1. National Natural Science Foundation of China [61471206, 61401220]
  2. Natural Science Foundation of Jiangsu Province [BK20141428, BK20140884]
  3. Science Foundation of Ministry of Education-China Mobile Communications Corporation [MCM20150504]

Traditional methods of facial expression recognition commonly use hand-crafted spatial features. This paper proposes a multi-channel deep neural network that learns and fuses spatial-temporal features for recognizing facial expressions in static images. The essential idea is to extract optical flow from the changes between the peak expression face image (emotional-face) and the neutral face image (neutral-face) as the temporal information of a facial expression, and to use the gray-level image of the emotional-face as the spatial information. A Multi-channel Deep Spatial-Temporal feature Fusion neural Network (MDSTFN) is presented to perform deep spatial-temporal feature extraction and fusion from static images. Each channel of the proposed method is fine-tuned from a pretrained deep convolutional neural network (CNN) instead of being trained from scratch. In addition, an average-face is used as a substitute for the neutral-face in real-world applications. Extensive experiments are conducted to evaluate the proposed method on benchmark databases including CK+, MMI, and RaFD. The results show that the optical flow information between the emotional-face and the neutral-face is a useful complement to spatial features and can effectively improve the performance of facial expression recognition from static images. Compared with state-of-the-art methods, the proposed method achieves better recognition accuracy, with rates of 98.38% on the CK+ database, 99.17% on the RaFD database, and 99.59% on the MMI database. (C) 2017 Elsevier B.V. All rights reserved.
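The pipeline described in the abstract (a spatial channel fed by the gray-level emotional-face, a temporal channel fed by the motion between neutral-face and emotional-face, an average-face standing in for the neutral-face, and fusion before classification) can be sketched roughly as follows. This is a minimal illustrative NumPy sketch, not the authors' MDSTFN: the random-projection feature extractor, the frame-difference stand-in for optical flow, the dimensions, and the classifier weights are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_features(image, dim=128):
    # Hypothetical stand-in for one fine-tuned CNN channel: a random
    # projection of the flattened image followed by a ReLU.
    flat = image.ravel()
    w = rng.standard_normal((dim, flat.size)) / np.sqrt(flat.size)
    return np.maximum(w @ flat, 0.0)

# Average-face as a substitute for the unavailable neutral-face.
neutral_stack = rng.random((10, 32, 32))        # 10 neutral face images
average_face = neutral_stack.mean(axis=0)

emotional_face = rng.random((32, 32))           # peak-expression image

# Crude temporal cue: a frame difference standing in for true optical flow.
flow_proxy = emotional_face - average_face

spatial = channel_features(emotional_face)      # spatial channel
temporal = channel_features(flow_proxy)         # temporal channel

# Fusion by concatenation, then a softmax over 7 basic expressions.
fused = np.concatenate([spatial, temporal])
w_cls = rng.standard_normal((7, fused.size)) * 0.01
logits = w_cls @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(fused.shape)  # (256,)
```

In the actual method, each channel is a pretrained deep CNN fine-tuned on expression data, and the temporal input is genuine optical flow computed between the two face images rather than a pixel difference.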


Recommended

Article Computer Science, Hardware & Architecture

Video action recognition with visual privacy protection based on compressed sensing

Jixin Liu, Ruxue Zhang, Guang Han, Ning Sun, Sam Kwong

Summary: This paper proposes a method for video action recognition that protects visual privacy using compressed sensing, while balancing operational efficiency with recognition accuracy. The method utilizes a convolutional 3D network model and PCA to reduce temporal complexity, and integrates a sparse representation-based classification algorithm to improve recognition performance. Experiments show the method's robustness in video action recognition tasks and its ability to adequately protect visual privacy.

JOURNAL OF SYSTEMS ARCHITECTURE (2021)

Article Engineering, Electrical & Electronic

Dual attention and part drop network for person reidentification

Guang Han, Yuechuan Ai, Jixin Liu, Ning Sun, Guangwei Gao

Summary: The DAPD-Net proposed in this study utilizes dual attention and part drop modules to enhance person reidentification, improve network performance, and increase resilience to occlusion.

JOURNAL OF ELECTRONIC IMAGING (2021)

Article Computer Science, Artificial Intelligence

Multi-stream slowFast graph convolutional networks for skeleton-based action recognition

Ning Sun, Ling Leng, Jixin Liu, Guang Han

Summary: A SlowFast graph convolutional network (SF-GCN) is proposed for improved spatial-temporal feature extraction from skeleton sequences, adopting the architecture of the SlowFast network in a GCN model. SF-GCN consists of Fast and Slow pathways that extract features of fast and slow temporal changes, respectively, which are fused and weighted using lateral connections and channel attention. This design enhances feature extraction ability while significantly reducing computational cost.

IMAGE AND VISION COMPUTING (2021)
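The slow/fast pathway design summarized in the SF-GCN entry above can be illustrated with a toy sketch. The sampling stride, temporal mean pooling, lateral scale, and sigmoid gate below are illustrative assumptions, not the published model, which uses graph convolutions over the skeleton topology in each pathway.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy skeleton sequence: 32 frames x 25 joints x 3 coordinates.
seq = rng.random((32, 25, 3))

fast = seq            # full frame rate: captures fast temporal changes
slow = seq[::4]       # stride-4 sampling: captures slow temporal changes

# Stand-in for per-pathway GCN output: temporal mean pooling per channel.
f_fast = fast.mean(axis=0).ravel()   # 25 * 3 = 75-dimensional feature
f_slow = slow.mean(axis=0).ravel()

# Lateral connection: inject fast-pathway information into the slow pathway.
f_slow = f_slow + 0.5 * f_fast

# Channel attention: a sigmoid gate weights the two pathways per channel.
gate = 1.0 / (1.0 + np.exp(-(f_slow - f_fast)))
fused = gate * f_slow + (1.0 - gate) * f_fast
print(fused.shape)  # (75,)
```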

Article Radiology, Nuclear Medicine & Medical Imaging

Deep Learning for Detection of Intracranial Aneurysms from Computed Tomography Angiography Images

Xiujuan Liu, Jun Mao, Ning Sun, Xiangrong Yu, Lei Chai, Ye Tian, Jianming Wang, Jianchao Liang, Haiquan Tao, Lihua Yuan, Jiaming Lu, Yang Wang, Bing Zhang, Kaihua Wu, Yiding Wang, Mengjiao Chen, Zhishun Wang, Ligong Lu

Summary: This study developed a new method using deep learning to automatically detect intracranial aneurysms from CTA images. The performance of the method was evaluated, showing reliable segmentation and detection of intracranial aneurysms, with a sensitivity of 100% for large and medium-sized aneurysms.

JOURNAL OF DIGITAL IMAGING (2023)

Article Computer Science, Hardware & Architecture

Appearance and geometry transformer for facial expression recognition in the wild

Ning Sun, Yao Song, Jixin Liu, Lei Chai, Haian Sun

Summary: In this paper, a model called the appearance and geometry transformer (AGT) is proposed to improve the accuracy of facial expression recognition (FER) in the wild. The AGT performs feature extraction and fusion on heterogeneous data using two transformer pathways. It achieves comparable results to state-of-the-art methods on benchmark databases FERplus and RAF-DB.

COMPUTERS & ELECTRICAL ENGINEERING (2023)

Article Computer Science, Hardware & Architecture

Combined CNN/RNN video privacy protection evaluation method for monitoring home scene violence

Jixin Liu, Pengcheng Dai, Guang Han, Ning Sun

Summary: Rapid technological advancements have led to an increase in the number of video surveillance devices in homes, prompting the development of various methods for video privacy protection. This paper proposes a method for evaluating the level of privacy protection in multilayer compressed sensing videos. By using a combination of CNN and RNN networks, the proposed approach achieves better prediction and generalization performance than previous methods. Additionally, an association model is established between the visual privacy protection score and the practicability score, allowing practical application and evaluation of other video privacy protection methods.

COMPUTERS & ELECTRICAL ENGINEERING (2023)

Article Computer Science, Artificial Intelligence

3-D Facial Feature Reconstruction and Learning Network for Facial Expression Recognition in the Wild

Ning Sun, Jianglong Tao, Jixin Liu, Haian Sun, Guang Han

Summary: In this article, a novel end-to-end trainable 3-D face feature reconstruction and learning network (3-DF-RLN) is proposed to improve the performance of facial expression recognition (FER) in the wild. Through 3-D face reconstruction, both the missing facial information and accurate facial geometric information can be effectively obtained. The proposed 3-DF-RLN model achieves FER by fusing apparent features from 2-D face images and geometric features from 3-D facial landmarks. Experimental results on benchmark databases demonstrate the superior FER performance of the proposed method, and the face graph from the geometry pathway reveals the correlations between facial landmarks in FER.

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Unsupervised Cross-View Facial Expression Image Generation and Recognition

Ning Sun, Qingyi Lu, Wenming Zheng, Jixin Liu, Guang Han

Summary: We propose an unsupervised cross-view facial expression adaptation network (UCFEAN) that can generate and recognize cross-view facial expressions in images in an unsupervised manner. UCFEAN converts the unsupervised domain adaptation between two image spaces into semi-supervised learning in feature spaces. It uses a generative adversarial network to perform cyclic image generation and project unlabelled target images and labelled source images to the corresponding feature spaces. The proposed method achieves realistic target image generation and high precision recognition of cross-view facial expressions.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2023)

Article Radiology, Nuclear Medicine & Medical Imaging

Comparison Between the Stereoscopic Virtual Reality Display System and Conventional Computed Tomography Workstation in the Diagnosis and Characterization of Cerebral Arteriovenous Malformations

Xiujuan Liu, Jun Mao, Ning Sun, Xiangrong Yu, Lei Chai, Ye Tian, Jianming Wang, Jianchao Liang, Haiquan Tao, Zhishun Wang, Ligong Lu

Summary: This study aimed to evaluate the ability of the stereoscopic virtual reality display system (SVRDS) in displaying the angioarchitecture of cerebral arteriovenous malformations (CAVMs) by comparing its accuracy with that of the conventional computed tomography workstation (CCTW). Retrospective analysis of computed tomography angiography images was performed on 19 patients with confirmed CAVM, and the angioarchitectural parameters were recorded and compared between SVRDS and CCTW. SVRDS showed advantages in displaying the blood vessels of CAVMs compared to CCTW, and it provided a more intuitive visualization of the overall spatial structure.

JOURNAL OF DIGITAL IMAGING (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Multi-modal Scene Recognition Based on Global Self-attention Mechanism

Xiang Li, Ning Sun, Jixin Liu, Lei Chai, Haian Sun

Summary: This paper proposes an end-to-end trainable network model MSR-Trans based on the global self-attention mechanism for multi-modal scene recognition. The model utilizes two transformer-based branches to extract features from RGB image and depth data, and then uses a fusion layer to fuse these features for final scene recognition. Lateral connections are added on some layers between the two branches to explore the relationship between multi-modal information, and a dropout layer is embedded in the transformer block to prevent overfitting. Extensive experiments on SUN RGB-D and NYUD2 datasets show that the proposed method achieves recognition accuracies of 69.0% and 74.1% for multi-modal scene recognition, respectively.

ADVANCES IN NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, ICNC-FSKD 2022 (2023)

Article Computer Science, Hardware & Architecture

Privacy-Preserving Video Fall Detection via Chaotic Compressed Sensing and GAN-Based Feature Enhancement

Jixin Liu, Ru Meng, Ning Sun, Guang Han, Sam Kwong

Summary: This study proposes a computer vision fall detection method to protect video privacy. By utilizing compressed sensing visual privacy protection and GAN-based feature enhancement, the method can effectively detect fall behavior with high accuracy.

IEEE MULTIMEDIA (2022)

Article Computer Science, Information Systems

Privacy-Preserving In-Home Fall Detection Using Visual Shielding Sensing and Private Information-Embedding

Jixin Liu, Rong Tan, Guang Han, Ning Sun, Sam Kwong

Summary: The study proposes a fall detection system with visual shielding to ensure the safety of elderly people at home while preserving their personal privacy. Through multilayer compressed sensing and feature extraction, accurate identification of fall behaviors is achieved.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Fall Detection under Privacy Protection Using Multi-layer Compressed Sensing

Ji-xin Liu, Rong Tan, Ning Sun, Guang Han, Xiao-fei Li

2020 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD 2020) (2020)

Article Engineering, Electrical & Electronic

Visual privacy-preserving level evaluation for multilayer compressed sensing model using contrast and salient structural features

Jixin Liu, Zheng Tang, Ning Sun, Guang Han, Sam Kwong

SIGNAL PROCESSING-IMAGE COMMUNICATION (2020)

Article Computer Science, Information Systems

Multi-Target Tracking Based on High-Order Appearance Feature Fusion

Guang Han, Yan Gao, Ning Sun

IEEE ACCESS (2019)
