Article

Moving average multi directional local features for speaker recognition

Publisher

SPRINGER
DOI: 10.1007/s10586-018-2030-5

Keywords

Speaker recognition; MDLF; MA-MDLF; Local feature; MFCC; LPCC; RASTA-PLP

Funding

  1. Deanship of Scientific Research at King Saud University [RG-1438-071]

This paper presents a new speech feature extraction technique called moving average multi directional local features (MA-MDLF). The method is based on linear regression (LR) and moving average (MA) in the time-frequency plane. Three-point LR is taken along the time axis and the frequency axis, and three-point MA is taken along the 45-degree and 135-degree directions in the time-frequency plane. The LR captures voice onset/offset and formant contours, while the MA captures the dynamics across the time-frequency axes, which can be seen as voiceprints. MA-MDLF is compared with commonly used speech features in a speaker recognition system (SRS) under three conditions: clean speech, mobile speech, and cross channel. MA-MDLF outperformed the baseline MFCC, RASTA-PLP, and LPCC features. It performed best on clean and mobile speech and also performed very well in the cross-channel task. MA-MDLF was also evaluated on three speech databases, namely KSU, LDC Babylon, and TIMIT, and it outperformed the other commonly used features on all three. The first two databases contain Arabic speech, while the third contains English speech.
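The four directional operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ma_mdlf_sketch`, the array layout (rows as frequency bins, columns as time frames), the mapping of 45 and 135 degrees onto array diagonals, and the decision to drop border points are all assumptions made for illustration. The only fixed fact used is that the least-squares slope through three equally spaced points at -1, 0, +1 equals (y[+1] - y[-1]) / 2.

```python
import numpy as np

def ma_mdlf_sketch(S):
    """Hypothetical illustration of four directional local features.

    S is assumed to be a 2-D time-frequency array (e.g. log spectral
    energies) with rows = frequency bins and columns = time frames.
    Only interior points are kept, so the output is 2 smaller per axis.
    """
    S = np.asarray(S, dtype=float)
    centre = S[1:-1, 1:-1]

    # Three-point linear regression: for abscissae -1, 0, +1 the
    # least-squares slope reduces to (y[+1] - y[-1]) / 2.
    lr_time = (S[1:-1, 2:] - S[1:-1, :-2]) / 2.0   # slope along time
    lr_freq = (S[2:, 1:-1] - S[:-2, 1:-1]) / 2.0   # slope along frequency

    # Three-point moving average along the diagonals. Assumption:
    # 45 degrees = time and frequency both increasing,
    # 135 degrees = time increasing while frequency decreases.
    ma_45 = (S[:-2, :-2] + centre + S[2:, 2:]) / 3.0
    ma_135 = (S[2:, :-2] + centre + S[:-2, 2:]) / 3.0

    # Stack into shape (4, n_freq - 2, n_time - 2).
    return np.stack([lr_time, lr_freq, ma_45, ma_135])
```

For a plane such as `S[f, t] = 4*f + t`, the two LR channels return the constant slopes 1 and 4, and both MA channels return the centre value, which is a quick sanity check on the indexing.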
