In this paper, we introduce a novel algorithm for multi-scale Fourier transform analysis of piecewise stationary signals, with application to automatic speech recognition. Such signals are composed of quasi-stationary segments of variable length. The proposed algorithm therefore analyzes the signal with windows of several sizes. The resulting power spectra are normalized to unit energy, and the entropy of each power spectrum is computed. These entropies are further normalized because they are computed over different numbers of sample points. Among these power spectra, the one with the minimum normalized entropy is retained as the optimal power spectrum estimate. Experiments with speech signals show that features based on the proposed multi-scale Fourier transform improve speech recognition performance in various non-stationary noise conditions when compared directly with features based on a single fixed-scale Fourier transform.
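The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `normalized_spectral_entropy` and `select_scale`, the Hamming window, the candidate window sizes, and in particular the choice to normalize the entropy by the logarithm of the number of frequency bins are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def normalized_spectral_entropy(segment):
    """Return (unit-energy power spectrum, normalized entropy) for one segment.

    Assumption: the entropy is divided by log(number of bins) so that
    spectra computed with different window lengths are comparable.
    """
    windowed = segment * np.hamming(len(segment))  # assumed taper
    power = np.abs(np.fft.rfft(windowed)) ** 2
    total = power.sum()
    if total == 0:
        # Silent segment: fall back to a flat spectrum (maximum entropy).
        p = np.full(power.shape, 1.0 / len(power))
    else:
        p = power / total  # unit-energy normalization
    nonzero = p[p > 0]  # skip zero bins to avoid log(0)
    entropy = -np.sum(nonzero * np.log(nonzero))
    return p, entropy / np.log(len(p))


def select_scale(signal, center, window_sizes):
    """Pick the window size whose spectrum has minimum normalized entropy.

    Windows are centered on the sample index `center`; sizes that do not
    fit inside the signal are skipped.
    """
    best = None
    for size in window_sizes:
        start = center - size // 2
        end = start + size
        if start < 0 or end > len(signal):
            continue
        p, h = normalized_spectral_entropy(signal[start:end])
        if best is None or h < best[1]:
            best = (size, h, p)
    if best is None:
        raise ValueError("no window size fits inside the signal")
    return best


if __name__ == "__main__":
    n = np.arange(1024)
    tone = np.sin(2 * np.pi * 0.1 * n)  # synthetic test signal
    size, h, spectrum = select_scale(tone, 512, [128, 256, 512])
    print(size, round(float(h), 3), spectrum.shape)
```

In a full front end this selection would be repeated for every analysis frame, with the retained spectrum passed on to feature extraction; that stage is outside the scope of this sketch.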
Martin Vetterli, Thierry Blu, Hanjie Pan