Sparse Hidden Markov Models for Exemplar-based Speech Recognition Using Deep Neural Network Posterior Features

In this work, statistical speech recognition is cast as a natural realization of the compressive sensing problem. The compressed acoustic observations are sub-word posterior probabilities obtained from a deep neural network. Dictionary learning and sparse recovery are exploited to infer the high-dimensional sparse word posterior probabilities. This formulation amounts to a sparse hidden Markov model in which each state is characterized by a dictionary learned from training exemplars, and the emission probabilities are obtained from sparse representations of test exemplars. This dictionary-based speech processing paradigm alleviates the need for the huge collection of exemplars required by conventional exemplar-based methods. We study the performance of the proposed approach for continuous speech recognition on the Phonebook and Numbers'95 databases.
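
The idea of scoring HMM states through sparse representations can be illustrated with a small sketch. This is not the authors' implementation: the solver (a non-negative ISTA for an L1-regularized least-squares problem), the scoring rule (the share of sparse-coefficient mass falling on each state's atoms), the function names `sparse_code` and `state_scores`, and the toy dictionaries are all assumptions made for illustration. In the paper the dictionaries are learned from training exemplars; here they are written by hand.

```python
import numpy as np

def sparse_code(D, z, lam=0.05, n_iter=200):
    """Non-negative ISTA for min_a 0.5*||z - D a||^2 + lam*||a||_1 with a >= 0.

    D: (n_classes, n_atoms) dictionary, z: (n_classes,) test posterior vector.
    The solver choice is an assumption, not taken from the paper.
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - z)
        # Gradient step followed by the proximal step for the L1 + non-negativity terms
        a = np.maximum(a - step * (grad + lam), 0.0)
    return a

def state_scores(dictionaries, z, lam=0.05):
    """Hypothetical emission-like score: coefficient mass per state, normalized."""
    D = np.hstack(dictionaries)                      # concatenate per-state dictionaries
    a = sparse_code(D, z, lam)
    bounds = np.cumsum([0] + [Dk.shape[1] for Dk in dictionaries])
    mass = np.array([a[bounds[k]:bounds[k + 1]].sum()
                     for k in range(len(dictionaries))])
    total = mass.sum()
    if total == 0:                                   # nothing selected: fall back to uniform
        return np.full(len(dictionaries), 1.0 / len(dictionaries))
    return mass / total

# Toy example (made-up numbers): 3 sub-word classes, 2 states, 2 atoms per state.
D0 = np.array([[0.90, 0.80],
               [0.05, 0.15],
               [0.05, 0.05]])   # state 0 atoms concentrate on class 0
D1 = np.array([[0.05, 0.10],
               [0.05, 0.10],
               [0.90, 0.80]])   # state 1 atoms concentrate on class 2
z = np.array([0.85, 0.10, 0.05])  # test posterior vector, close to state 0 exemplars

scores = state_scores([D0, D1], z)
```

In a full recognizer, scores of this kind would be computed per frame and passed to Viterbi decoding in place of conventional emission likelihoods; that step is omitted here.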

Statistical Inference for Inverse Problems: From Sparsity-Based Methods to Neural Networks

Deep Learning Generalization with Limited and Noisy Labels

Leveraging Unlabeled Data to Track Memorization
