Article

Speech emotion recognition model based on Bi-GRU and Focal Loss

PATTERN RECOGNITION LETTERS (2020)

Journal

PATTERN RECOGNITION LETTERS

Volume 140, Pages 358-365

Publisher

ELSEVIER

DOI: 10.1016/j.patrec.2020.11.009

Keywords

Bi-GRU; Focal loss; Speech emotion recognition; Deep learning; CRNN

Category

Computer Science, Artificial Intelligence

Funding

Special Projects in Key Areas (New Generation of Information Technology) of Colleges and Universities in Guangdong Province [2020ZDZX3046]
Characteristics Innovation Project of Colleges and Universities of Guangdong Province [2019KTSCX235, 2019KTSCX234]
Higher Education of the Ministry of Education of the People's Republic of China [201901070016]


Abstract

To address inconsistent sample durations and imbalanced sample categories in speech emotion corpora, this paper proposes a speech emotion recognition model based on Bi-GRU (Bidirectional Gated Recurrent Unit) and Focal Loss. The model builds on the deep-learning CRNN (Convolutional Recurrent Neural Network) architecture. Within the CRNN, Bi-GRU is used to effectively lengthen short-duration speech samples, and the Focal Loss function is used to handle the classification difficulties caused by the imbalance of emotion categories among the samples. Different methods are compared experimentally, with weighted average recall (WAR), unweighted average recall (UAR), and the confusion matrix (CM) used as evaluation metrics. The experimental results show that the proposed model improves recognition accuracy and mitigates the effect of sample imbalance on the IEMOCAP database, and they indicate that the improvement in speech emotion recognition performance is not due to adjustments of the model parameters or changes in the model topology. (c) 2020 Elsevier B.V. All rights reserved.
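The abstract names Focal Loss and the WAR/UAR metrics without giving formulas, so the following is a minimal sketch of their standard definitions rather than the paper's implementation. It assumes the common multi-class form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t); the values `gamma=2.0` and `alpha=1.0`, and the helper names `softmax`, `focal_loss`, and `war_uar`, are illustrative assumptions that do not come from the paper.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample (gamma and alpha are assumed defaults).

    p_t is the predicted probability of the true class. The factor
    (1 - p_t) ** gamma shrinks the loss of well-classified samples, so
    hard samples (often from minority classes) dominate training.
    With gamma = 0 and alpha = 1 this reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) for two equal-length label sequences.

    WAR weights each class's recall by its sample count, which equals
    overall accuracy. UAR is the plain mean of per-class recalls, so
    every class counts equally regardless of its size.
    """
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    war = sum(hits.values()) / len(y_true)
    uar = sum(hits[c] / n for c, n in support.items()) / len(support)
    return war, uar


if __name__ == "__main__":
    # A classifier that always predicts the majority class 0.
    print(war_uar([0, 0, 0, 1], [0, 0, 0, 0]))  # (0.75, 0.5)
    # A confident correct prediction is penalized far less than a wrong one.
    print(focal_loss([5.0, 0.0], 0) < focal_loss([0.0, 5.0], 0))  # True
```

The toy labels illustrate why both metrics are reported on an imbalanced corpus: always predicting the majority class still yields a WAR of 0.75, while the UAR of 0.5 exposes that the minority class is never recognized.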
