This paper proposes a simple, computationally efficient 2-mixture model to discriminate between speech and background noise at the magnitude spectrogram level. The model is derived directly from observations on real data and can be trained in a fully unsupervised manner with the EM algorithm. Here, the 2-mixture model is used in an "Unsupervised Spectral Subtraction" scheme that can be applied as a pre-processing step before any acoustic feature extraction scheme, such as MFCCs or PLP. The goal is to improve the noise robustness of the acoustic features. Experiments on both the OGI Numbers 95 and Aurora 2 tasks yielded a major improvement in all noise conditions, while retaining similar performance in clean conditions.
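To make the idea concrete, here is a minimal sketch of a two-component mixture fitted with EM and then used for spectral subtraction. It is not the paper's method: the abstract does not specify the component distributions, so this sketch assumes a two-component Gaussian mixture on log-magnitudes, fitted once over all time-frequency bins. The function names `fit_two_gaussians` and `unsupervised_subtraction`, the `floor` parameter, and the rule that the lower-mean component is noise are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_two_gaussians(x, n_iter=50):
    """EM for a 1-D two-component Gaussian mixture (illustrative stand-in model).

    Returns mixture weights, means, variances, and the responsibilities
    computed in the last E-step (shape: len(x) x 2).
    """
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: one component low, one high.
    mu = np.percentile(x, [25.0, 75.0])
    var = np.full(2, x.var() + 1e-12)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: log of weight * Gaussian density, per sample and component.
        log_p = (np.log(w) - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var, resp


def unsupervised_subtraction(mag, floor=0.05):
    """Subtract a noise estimate from a magnitude spectrogram.

    mag: non-negative array of shape (n_freq, n_frames).
    floor: hypothetical spectral floor, as a fraction of the input magnitude.
    """
    log_mag = np.log(mag + 1e-10)
    _, mu, _, resp = fit_two_gaussians(log_mag.ravel())
    noise = int(np.argmin(mu))  # assumption: lower-mean component is noise
    p_noise = resp[:, noise].reshape(mag.shape)
    # Per-frequency noise magnitude: posterior-weighted average over frames.
    noise_mag = (p_noise * mag).sum(axis=1) / (p_noise.sum(axis=1) + 1e-12)
    cleaned = np.maximum(mag - noise_mag[:, None], floor * mag)
    return cleaned, p_noise
```

In a pipeline like the one described above, `cleaned` would replace the original magnitude spectrogram before feature extraction such as MFCCs or PLP. No labelled speech or noise segments are needed, since the noise posterior comes entirely from the EM fit.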
Jean-Philippe Lucien Montillet, Feng Zhou
Sabine Süsstrunk, Majed El Helou