A New Ratio Mask Representation for CASA-Based Speech Enhancement

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2019)

Journal

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

Volume 27, Issue 1, Pages 7-19

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TASLP.2018.2868407

Keywords

Computational auditory scene analysis (CASA); ideal ratio mask (IRM); deep neural networks (DNN); speech enhancement

Abstract

In computational auditory scene analysis (CASA), the ideal ratio mask, or alternatively the ideal binary mask, is the key to reconstructing the enhanced signal. Current work considers the ratio mask in its Wiener-filtering form or its square-root form. However, this kind of ratio mask overlooks one important issue: it does not exploit the inter-channel correlation (ICC) in the noisy speech, noise, and clean speech spectra. In this paper, we therefore first propose a novel ratio mask representation that utilizes the ICC. In this way, we adaptively reallocate the power ratio of the speech and noise during the construction of the ratio mask, so that more speech components are retained and more noise components are masked at the same time. Second, a channel-weight contour based on the equal-loudness hearing attribute is adopted to revise this new ratio mask in each Gammatone filterbank channel. Finally, the revised ratio mask is used to train a five-layer deep neural network. Experiments show that the proposed ratio mask performs better than the conventional ratio mask representation and a series of other enhancement algorithms in terms of speech quality, intelligibility, and spectral distortion under different signal-to-noise ratio conditions using six types of noise.
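For orientation, the conventional ratio mask that the abstract takes as its baseline can be sketched as below. This is a minimal illustration of the standard Wiener-type mask and its square root only, not the ICC-based mask proposed in the paper, whose formula the abstract does not give. The function name `ratio_masks`, the `eps` guard, and the toy power values are assumptions made for the example.

```python
import numpy as np

def ratio_masks(speech_power, noise_power, eps=1e-12):
    """Return (wiener_mask, sqrt_mask) for arrays of per-unit power.

    Inputs are assumed to be non-negative speech and noise powers on a
    (channels x frames) time-frequency grid, e.g. Gammatone channel outputs.
    """
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Wiener-type mask: fraction of the unit's power that belongs to speech.
    # eps (an assumption of this sketch) avoids division by zero in silent units.
    wiener = speech_power / (speech_power + noise_power + eps)
    # Square-root form: the same ratio applied in the magnitude domain.
    return wiener, np.sqrt(wiener)

# Toy example: 2 channels x 3 frames of hypothetical power values.
S = np.array([[4.0, 1.0, 0.0],
              [9.0, 1.0, 2.0]])
N = np.array([[4.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])
wiener, sqrt_mask = ratio_masks(S, N)
```

Each time-frequency unit is handled independently here, which is the limitation the abstract points to: no information is shared across neighboring channels when the mask is built.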
