Publication

Acoustic models for posterior features in speech recognition

Related publications (135)

Acoustic Models for Posterior Features in Speech Recognition

Guillermo Aradilla

In this thesis, we investigate the use of posterior probabilities of sub-word units directly as input features for automatic speech recognition (ASR). These posteriors, estimated from data-driven methods, display some favourable properties such as increase ...
Ecole Polytechnique Fédérale de Lausanne, 2008

Acoustic Models for Posterior Features in Speech Recognition

Guillermo Aradilla

In this thesis, we investigate the use of posterior probabilities of sub-word units directly as input features for automatic speech recognition (ASR). These posteriors, estimated from data-driven methods, display some favourable properties such as increase ...
Idiap, 2008

Using KL-based Acoustic Models in a Large Vocabulary Recognition Task

Hervé Bourlard, Guillermo Aradilla

Posterior probabilities of sub-word units have been shown to be an effective front-end for ASR. However, attempts to model this type of features either do not benefit from modeling context-dependent phonemes, or use an inefficient distribution to estimate ...
IDIAP, 2008

Using Comparison of Parallel Phoneme Probability streams for OOV Word Detection

Mathew Magimai Doss, Hynek Hermansky, Tamara Tosic

In this paper, we investigate the approach of comparing two different parallel streams of phoneme posterior probability estimates for OOV word detection. The first phoneme posterior probability stream is estimated using only the knowledge of short-term acou ...
2008

Combining Evidence from a Generative and a Discriminative Model in Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

We investigate the use of the log-likelihood of the features obtained from a generative Gaussian mixture model, and the posterior probability of phonemes from a discriminative multilayered perceptron in multi-stream combination for recognition of phonemes. ...
IDIAP, 2008

Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator

Hynek Hermansky, Joel Praveen Pinto

We present a framework to apply Volterra series to analyze multilayered perceptrons trained to estimate the posterior probabilities of phonemes in automatic speech recognition. The identified Volterra kernels reveal the spectro-temporal patterns that are l ...
Idiap, 2008

Bayesian Inference and Optimal Design in the Sparse Linear Model

Matthias Seeger

The linear model with sparsity-favouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertain ...
Massachusetts Institute of Technology Press, 2008

Using entropy as a stream reliability estimate for audio-visual speech recognition

Jean-Philippe Thiran, Mihai Gurban

We present a method for dynamically integrating audio-visual information for speech recognition, based on the estimated reliability of the audio and visual streams. Our method uses an information theoretic measure, the entropy derived from the state probab ...
2008

Posterior Features Applied to Speech Recognition Tasks with Limited Training Data

Hervé Bourlard, Guillermo Aradilla

This paper describes an approach where posterior-based features are applied in those recognition tasks where the amount of training data is insufficient to obtain a reliable estimate of the speech variability. A template matching approach is considered in ...
IDIAP, 2008
