Article

Multilingual speech recognition for GlobalPhone languages

Journal

SPEECH COMMUNICATION
Volume 140, Pages 71-86

Publisher

ELSEVIER
DOI: 10.1016/j.specom.2022.03.006

Keywords

Multilingual ASR; GlobalPhone; Multilingual mix; Transfer and multitask learning

Funding

  1. University of Bremen
  2. Georg Forster Fellowship
  3. Alexander von Humboldt Foundation

This paper presents research on developing multilingual automatic speech recognition (ML ASR) systems using the GlobalPhone database. It analyzes the phonetic overlap and morphological complexity of the languages involved and builds ML ASR systems with deep neural network based training approaches. The results show that languages with small amounts of monolingual training data benefit substantially from ML training, and that training with phonetically related languages is more beneficial than training with less related ones.
In this paper, we present our investigations towards the development of multilingual automatic speech recognition (ML ASR) systems using the GlobalPhone database. In addition to the GlobalPhone languages, we have included four Ethiopian languages (Amharic, Oromo, Tigrigna and Wolaytta), as well as Uyghur and English, in our investigation. To see the impact of language relatedness on ML ASR training, we have analyzed both the phonetic overlap and the morphological complexity of the languages. Deep neural network based ML ASR systems have been developed using ML mix, transfer learning and multitask learning approaches. Relative word error rate (WER) reductions of up to 33.21% have been achieved by using the resources of other languages in ML acoustic model training. Our experimental results show that languages with small amounts of monolingual training data benefit substantially from ML training. Moreover, using phonetically related languages in ML training is more beneficial than using phonetically less related languages. The nature of the corpus (single or mixed domain, noisy or noise free, etc.) also appears to have an impact on ML training, although it is not as important as the phonetic relatedness of the languages.
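
For readers unfamiliar with the metric, a relative WER reduction is conventionally the drop in WER divided by the baseline WER. The sketch below illustrates that arithmetic only; the function name and the WER values are hypothetical and are not taken from the paper, which does not state its exact formula in this abstract.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Assumes the conventional definition:
        100 * (baseline - new) / baseline
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only (not results from the paper):
# a monolingual baseline at 20.0% WER and a multilingual system at 15.0% WER.
reduction = relative_wer_reduction(baseline_wer=20.0, new_wer=15.0)
print(f"{reduction:.2f}% relative WER reduction")
```

Under this definition a relative reduction can look large even when the absolute change is a few percentage points, which is worth keeping in mind when reading figures such as the one quoted above.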
