Sparse Hidden Markov Models for Exemplar-based Speech Recognition Using Deep Neural Network Posterior Features
Related publications (46)
Deep neural networks have been empirically successful in a variety of tasks; however, their theoretical understanding is still poor. In particular, modern deep neural networks have many more parameters than training data. Thus, in principle they should over ...
Language-independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving audio documents from an archive that contain a spoken query provided by a user. This is usually cast as a hypothesis testing and pattern matching problem ...
Training deep neural networks with the error backpropagation algorithm is considered implausible from a biological perspective. Numerous recent publications suggest elaborate models for biologically plausible variants of deep learning, typically defining s ...
A common pattern of progress in engineering has seen deep neural networks displacing human-designed logic. There are many advantages to this approach, but divorcing decision-making from human oversight and intuition has costs as well. One is that deep neural ne ...
We derive generalization and excess risk bounds for neural networks using a family of complexity measures based on a multilevel relative entropy. The bounds are obtained by introducing the notion of generated hierarchical coverings of neural networks and b ...
We propose an information theoretic framework for quantitative assessment of acoustic models used in hidden Markov model (HMM) based automatic speech recognition (ASR). The HMM backend expects that (i) the acoustic model yields accurate state conditional e ...
Two distinct limits for deep learning have been derived as the network width h -> infinity, depending on how the weights of the last layer scale with h. In the neural tangent kernel (NTK) limit, the dynamics become linear in the weights and are described b ...
This thesis deals with exploiting the low-dimensional multi-subspace structure of speech towards the goal of improving acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that ...
In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully connected, input/output, etc.), but the main difference is that the fundamental convolutional l ...
Towards the goal of improving acoustic modeling for automatic speech recognition (ASR), this work investigates the modeling of senone subspaces in deep neural network (DNN) posteriors using low-rank and sparse modeling approaches. While DNN posteriors are ...