Journal
SPEECH COMMUNICATION
Volume 50, Issue 3, Pages 215-227Publisher
ELSEVIER
DOI: 10.1016/j.specom.2007.09.001
Keywords
articulatory-to-acoustic mapping; acoustic-to-articulatory inversion mapping; GMM; MMSE; dynamic features
Ask authors/readers for more resources
In this paper, we describe a statistical approach to both an articulatory-to-acoustic mapping and an acoustic-to-articulatory inversion mapping without using phonetic information. The joint probability density of an articulatory parameter and an acoustic parameter is modeled using a Gaussian mixture model (GMM) based on a parallel acoustic-articulatory speech database. We apply the GMM-based mapping using the minimum mean-square error (MMSE) criterion, which has been proposed for voice conversion, to the two mappings. Moreover, to improve the mapping performance, we apply maximum likelihood estimation (MLE) to the GMM-based mapping method. The determination of a target parameter trajectory having appropriate static and dynamic properties is obtained by imposing an explicit relationship between static and dynamic features in the MLE-based mapping. Experimental results demonstrate that the MLE-based mapping with dynamic features can significantly improve the mapping performance compared with the MMSE-based mapping in both the articulatory-to-acoustic mapping and the inversion mapping. (c) 2007 Elsevier B.V. All rights reserved.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available