Publication

Subband-Based Speech Recognition in Noisy Conditions: The Full Combination Approach

Related publications (40)

Support Vector Machines with a Reject Option

We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow’s rule, is defined by two thresholds on posterior probabilities. From simple des ...
2008

Using KL-based Acoustic Models in a Large Vocabulary Recognition Task

Hervé Bourlard, Guillermo Aradilla

Posterior probabilities of sub-word units have been shown to be an effective front-end for ASR. However, attempts to model this type of features either do not benefit from modeling context-dependent phonemes, or use an inefficient distribution to estimate ...
IDIAP, 2008

Combining Evidence from a Generative and a Discriminative Model in Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

We investigate the use of the log-likelihood of the features obtained from a generative Gaussian mixture model, and the posterior probability of phonemes from a discriminative multilayered perceptron in multi-stream combination for recognition of phonemes. ...
2008

Combining Evidence from a Generative and a Discriminative Model in Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

We investigate the use of the log-likelihood of the features obtained from a generative Gaussian mixture model, and the posterior probability of phonemes from a discriminative multilayered perceptron in multi-stream combination for recognition of phonemes. ...
IDIAP, 2008

Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator

Hynek Hermansky, Joel Praveen Pinto

We present a framework to apply Volterra series to analyze multilayered perceptrons trained to estimate the posterior probabilities of phonemes in automatic speech recognition. The identified Volterra kernels reveal the spectro-temporal patterns that are l ...
Idiap, 2008

Comparing Different Word Lattice Rescoring Approaches Towards Keyword Spotting

Hervé Bourlard, Hynek Hermansky, Joel Praveen Pinto

In this paper, we further investigate the large vocabulary continuous speech recognition approach to keyword spotting. Given a speech utterance, recognition is performed to obtain a word lattice. The posterior probability of keyword hypotheses in the latti ...
IDIAP, 2007

On Confusions in a Phoneme Recognizer

Hynek Hermansky, Joel Praveen Pinto

In this paper, we analyze the confusion patterns at three places in the hybrid phoneme recognition system. The confusions are analyzed at the pronunciation, the posterior probability, and the phoneme recognizer levels. The confusions show significant stru ...
2007

On Confusions in a Phoneme Recognizer

Hynek Hermansky, Joel Praveen Pinto

In this paper, we analyze the confusion patterns at three places in the hybrid phoneme recognition system. The confusions are analyzed at the pronunciation, the posterior probability, and the phoneme recognizer levels. The confusions show significant stru ...
IDIAP, 2007

The Boolean Solution to the Congested IP Link Location Problem: Theory and Practice

Patrick Thiran, Xuan Hung Nguyen

Like other problems in network tomography or traffic matrix estimation, the location of congested IP links from end-to-end measurements requires solving a system of equations that relate the measurement outcomes with the variables representing the status o ...
2007

Identifying unexpected words using in-context and out-of-context phoneme posteriors

Hynek Hermansky, Hamed Ketabdar

The paper proposes and discusses a machine approach for identification of unexpected (zero or low probability) words. The approach is based on use of two parallel recognition channels, one channel employing sensory information from the speech signal togeth ...
IDIAP, 2006
