Journal
MULTIMEDIA TOOLS AND APPLICATIONS
Volume 75, Issue 9, Pages 5109-5124
Publisher
SPRINGER
DOI: 10.1007/s11042-015-2935-4
Keywords
Accent recognition; GMM; DNN; Distant-talking speech; Machine learning
Funding
- JSPS KANKENHI [15K16020]
- Telecommunications Advancement Foundation (TAF), Japan
- Grants-in-Aid for Scientific Research [15K16020] Funding Source: KAKEN
Automatic accent recognition has recently attracted growing attention. However, little research has focused on accent recognition in distant-talking environments, which is important for improving distant-talking speech recognition performance for non-native accents. In this paper, we apply Gaussian Mixture Models (GMM) and a Deep Neural Network (DNN) to identify the speaker's accent in reverberant environments. We also propose combining the likelihoods of these two approaches. In a reverberant environment, the accent recognition rate improved from 90.7 % with the GMM to 93.0 % with the DNN. The combination of GMM and DNN achieved a recognition rate of 97.5 %, outperforming the individual GMM and DNN because the two models are complementary. The relative error reduction is 73.1 % over the GMM-based method and 64.3 % over the DNN-based method.