This work casts statistical speech recognition as a natural realization of the compressive sensing problem. The compressed acoustic observations are sub-word posterior probabilities obtained from a deep neural network. Dictionary learning and sparse recovery are used to infer the high-dimensional sparse word posterior probabilities. This formulation amounts to a sparse hidden Markov model in which each state is characterized by a dictionary learned from training exemplars, and the emission probabilities are obtained from sparse representations of test exemplars. This dictionary-based speech processing paradigm alleviates the need for the huge collection of exemplars required by conventional exemplar-based methods. We study the performance of the proposed approach for continuous speech recognition on the Phonebook and Numbers'95 databases.
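As a rough illustration of the sparse recovery step mentioned in the abstract, the sketch below recovers a sparse, non-negative coefficient vector from a low-dimensional observation given a fixed dictionary. It uses a generic iterative soft-thresholding (ISTA) solver on synthetic data, which is an assumption made here for illustration and not necessarily the solver or the dictionary construction used in the paper. The names `sparse_recover`, `D`, `z`, and `lam` are hypothetical.

```python
import numpy as np

def sparse_recover(D, z, lam=0.01, n_iter=500):
    """Non-negative ISTA for: min_a 0.5*||z - D a||^2 + lam*||a||_1, a >= 0.

    D: (m, n) dictionary with m < n (atoms are columns).
    z: (m,) observation, standing in for a sub-word posterior vector.
    """
    # Step size 1/L, where L is the largest eigenvalue of D^T D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - z)  # gradient of the quadratic term
        # Proximal step: soft-threshold and project onto a >= 0.
        a = np.maximum(a - step * (grad + lam), 0.0)
    return a

# Synthetic example (assumed data, not from the paper):
# a 20-dimensional observation explained by 2 of 50 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)   # unit-norm atoms
a_true = np.zeros(50)
a_true[[3, 17]] = [0.6, 0.4]
z = D @ a_true

a_hat = sparse_recover(D, z)
residual = np.linalg.norm(D @ a_hat - z)
```

In a dictionary-based recognizer of the kind described, one such problem would presumably be solved per test frame, with the recovered coefficients feeding the state emission scores; that coupling is not shown here.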
Afsaneh Asaei, Hervé Bourlard, Pranay Dighe