In this paper, we introduce a novel algorithm for multi-scale Fourier transform analysis of piecewise stationary signals, with application to automatic speech recognition. Such signals are composed of quasi-stationary segments of variable length. The proposed algorithm therefore analyzes each signal with windows of multiple sizes. The resulting power spectra are normalized to unit energy, and the entropy of each power spectrum is computed. Because these entropies are computed over different numbers of sample points, they are further normalized. Among these power spectra, the one with the minimum normalized entropy is retained as the optimal power spectrum estimate. In experiments with speech signals, the proposed multi-scale Fourier-transform-based features yield an increase in speech recognition performance in various non-stationary noise conditions when compared directly with single, fixed-scale Fourier-transform-based features.
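The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hamming taper, the windows centered on a common sample, the candidate window sizes, and the entropy normalization (division by the logarithm of the number of frequency bins) are all assumptions, and the function names `normalized_spectral_entropy` and `select_scale` are hypothetical.

```python
import numpy as np


def normalized_spectral_entropy(segment):
    """Return (normalized entropy, unit-energy power spectrum) of one segment."""
    # Assumption: a Hamming taper; the abstract does not name the window shape.
    windowed = segment * np.hamming(len(segment))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    total = power.sum()
    if total == 0:
        # A silent segment has no spectral structure: treat it as flat.
        p = np.full(len(power), 1.0 / len(power))
    else:
        # Normalize so that the power spectrum sums to one (unit energy).
        p = power / total
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log(nonzero))
    # Assumption: divide by log(number of bins) so that entropies computed
    # over different numbers of sample points are comparable (range 0 to 1).
    return entropy / np.log(len(p)), p


def select_scale(signal, center, window_sizes=(128, 256, 512)):
    """Pick the window size whose spectrum has minimum normalized entropy.

    Assumption: all candidate windows are centered on the same sample.
    Returns (normalized entropy, window size, unit-energy power spectrum).
    """
    best = None
    for size in window_sizes:
        start = center - size // 2
        if start < 0 or start + size > len(signal):
            continue  # skip windows that do not fit inside the signal
        h, p = normalized_spectral_entropy(signal[start:start + size])
        if best is None or h < best[0]:
            best = (h, size, p)
    if best is None:
        raise ValueError("no candidate window fits around the given center")
    return best
```

Calling `select_scale(x, center)` on a NumPy array `x` returns the retained entropy, window size, and spectrum for the frame around `center`. A spectrum concentrated in a few bins, as for a stationary tone, gives a low normalized entropy, while a spectrum spread across all bins, as for white noise, gives a value close to one.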
Martin Vetterli, Thierry Blu, Hanjie Pan