Article

Towards end-to-end speech recognition with transfer learning

Publisher

SPRINGER
DOI: 10.1186/s13636-018-0141-9

Keywords

Speech recognition; End-to-end; Transfer learning

Funding

  1. National Natural Science Foundation of China [61673395, 61403415]
  2. Natural Science Foundation of Henan Province [162300410331]

This paper presents a transfer learning-based approach to end-to-end speech recognition that operates at two levels. First, a feature extraction method that combines multilingual deep neural network (DNN) training with a matrix factorization algorithm is introduced to extract high-level features. Second, the advantage of connectionist temporal classification (CTC) is transferred to the target attention-based model through a joint CTC-attention model composed of shallow recurrent neural networks (RNNs) built on top of the proposed features. Experimental results show that the proposed transfer learning approach achieves the best performance among all end-to-end methods and, when further jointly decoded with an RNN language model, is comparable to the state-of-the-art speech recognition system on TIMIT.
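
As a rough illustration of how a joint CTC-attention model is commonly set up, the sketch below interpolates a CTC term and an attention term with a single weight, and optionally adds a language model term when scoring a hypothesis during decoding. The abstract does not give the exact formulation, so the function names, the weights `lam` and `beta`, and the linear interpolation itself are assumptions made for illustration, not the authors' implementation.

```python
def joint_loss(ctc_loss: float, att_loss: float, lam: float = 0.5) -> float:
    """Interpolate a CTC loss and an attention loss.

    `lam` is a hypothetical interpolation weight in [0, 1]; the value
    used in the paper is not stated in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * ctc_loss + (1.0 - lam) * att_loss


def joint_score(
    log_p_ctc: float,
    log_p_att: float,
    log_p_lm: float = 0.0,
    lam: float = 0.5,
    beta: float = 0.0,
) -> float:
    """Score one hypothesis during joint decoding.

    Combines CTC and attention log-probabilities with weight `lam` and
    adds an RNN language model log-probability scaled by `beta`.
    Both weights are assumed for illustration.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * log_p_ctc + (1.0 - lam) * log_p_att + beta * log_p_lm


if __name__ == "__main__":
    # Training objective with more weight on the attention branch.
    print(joint_loss(ctc_loss=2.0, att_loss=4.0, lam=0.25))
    # Hypothesis score with a language model contribution.
    print(joint_score(-1.0, -3.0, log_p_lm=-2.0, lam=0.5, beta=0.5))
```

Setting `lam` to 0 recovers a pure attention model and setting it to 1 recovers pure CTC, which makes the weight a convenient knob for checking how much each branch contributes.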