Article

Deep spatial-temporal feature fusion for facial expression recognition in static images

PATTERN RECOGNITION LETTERS (2019)

Journal

PATTERN RECOGNITION LETTERS

Volume 119, Pages 49-61

Publisher

ELSEVIER

DOI: 10.1016/j.patrec.2017.10.022

Keywords

Facial expression recognition; Deep neural network; Optical flow; Spatial-temporal feature fusion; Transfer learning

Category

Computer Science, Artificial Intelligence

Funding

National Natural Science Foundation of China [61471206, 61401220]
Natural Science Foundation of Jiangsu province [BK20141428, BK20140884]
Science Foundation of Ministry of Education-China Mobile Communications Corporation [MCM20150504]

Abstract

Traditional methods of facial expression recognition commonly use hand-crafted spatial features. This paper proposes a multi-channel deep neural network that learns and fuses spatial-temporal features for recognizing facial expressions in static images. The essential idea is to extract optical flow from the changes between the peak expression face image (emotional-face) and the neutral face image (neutral-face) as the temporal information of a given facial expression, and to use the gray-level image of the emotional-face as the spatial information. A Multi-channel Deep Spatial-Temporal feature Fusion neural Network (MDSTFN) is presented to perform deep spatial-temporal feature extraction and fusion from static images. Each channel of the proposed method is fine-tuned from a pretrained deep convolutional neural network (CNN) instead of training a new CNN from scratch. In addition, an average-face is used as a substitute for the neutral-face in real-world applications. Extensive experiments are conducted to evaluate the proposed method on benchmark databases including CK+, MMI, and RaFD. The results show that the optical flow information from emotional-face and neutral-face is a useful complement to spatial features and can effectively improve the performance of facial expression recognition from static images. Compared with state-of-the-art methods, the proposed method achieves better recognition accuracy, with rates of 98.38% on the CK+ database, 99.17% on the RaFD database, and 99.59% on the MMI database, respectively. © 2017 Elsevier B.V. All rights reserved.
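
As a rough illustration of two ideas in the abstract (substituting an average-face for the neutral-face, and fusing a spatial cue with a temporal cue), the following Python sketch may help. It is not the authors' MDSTFN: the function names are hypothetical, the per-pixel intensity difference is only a crude stand-in for the optical flow the paper uses, and concatenation is an assumed fusion rule, since the abstract does not state how the channels are combined.

```python
from typing import List

Image = List[List[float]]  # grayscale image: rows of pixel intensities


def average_face(faces: List[Image]) -> Image:
    """Pixel-wise mean of equally sized grayscale face images.

    Hypothetical helper: used here as a substitute for the neutral-face.
    """
    if not faces:
        raise ValueError("at least one face image is required")
    height, width = len(faces[0]), len(faces[0][0])
    for face in faces:
        if len(face) != height or any(len(row) != width for row in face):
            raise ValueError("all face images must share the same size")
    count = len(faces)
    return [
        [sum(face[y][x] for face in faces) / count for x in range(width)]
        for y in range(height)
    ]


def intensity_change(emotional: Image, reference: Image) -> Image:
    """Per-pixel difference between emotional-face and reference face.

    Assumption: a crude temporal cue for illustration only, NOT optical flow.
    """
    return [
        [e - r for e, r in zip(e_row, r_row)]
        for e_row, r_row in zip(emotional, reference)
    ]


def flatten(image: Image) -> List[float]:
    """Turn a 2-D image into a flat feature vector (row-major order)."""
    return [pixel for row in image for pixel in row]


def fuse(spatial: List[float], temporal: List[float]) -> List[float]:
    """Concatenation fusion (assumed; the abstract does not specify the rule)."""
    return spatial + temporal


# Usage: two tiny 2x2 "faces" form the average-face; a third is the emotional-face.
faces = [[[0.0, 2.0], [4.0, 6.0]], [[2.0, 4.0], [6.0, 8.0]]]
emotional = [[1.0, 5.0], [5.0, 10.0]]
reference = average_face(faces)
fused = fuse(flatten(emotional), flatten(intensity_change(emotional, reference)))
```

In a fuller pipeline, the raw pixels and the difference map would presumably be replaced by features from fine-tuned CNN channels and by a real optical flow field; this sketch only shows the data flow.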

Spatial-Temporal Attention Network for Depression Recognition from facial videos

Yuchen Pan, Yuanyuan Shang, Tie Liu, Zhuhong Shao, Guodong Guo, Hui Ding, Qiang Hu

Summary: This paper proposes a novel Spatial-Temporal Attention Depression Recognition Network (STA-DRN) that enhances feature extraction and relevance of depression recognition by capturing global and local spatial-temporal information. The experimental results demonstrate competitive performance and visualization analysis shows significant responses in specific locations related to depression.

EXPERT SYSTEMS WITH APPLICATIONS (2024)