Publication

HMM mixtures (HMM2) for robust speech recognition

Related publications (213)

Predictive Models for Music

Samy Bengio, Jean-François Paiement

Modeling long-term dependencies in time series has proved very difficult to achieve with traditional machine learning methods. This problem occurs when considering music data. In this paper, we introduce generative models for melodies. We decompose melodic ...

IDIAP, 2008

A multimodal pattern recognition framework for speaker detection

Patricia Besson

Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition, or speech synthesis, for example, find multiple applications in human-computer interaction, multimedia content indexin ...

EPFL, 2007

Robust overlapping speech recognition based on neural networks

John David Scott Dines, Weifeng Li

We address issues for improving hands-free speech recognition performance in the presence of multiple simultaneous speakers using multiple distant microphones. In this paper, a log spectral mapping is proposed to estimate the log mel-filterbank outputs of ...

IDIAP, 2007

A Bayesian Switching Linear Dynamical System for Scale-Invariant robust speech extraction

Bertrand Mesot

Most state-of-the-art automatic speech recognition (ASR) systems deal with noise in the environment by extracting noise robust features which are subsequently modelled by a Hidden Markov Model (HMM). A limitation of this feature-based approach is that the ...

IDIAP, 2007

Unsupervised Speech/Non-speech Detection for Automatic Speech Recognition in Meeting Rooms

Daniel Gatica-Perez, Petr Motlicek

The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution is based on computing the long-term modulation spectrum and examining a specific frequency range for dominant ...

2007

Correcting Confusion Matrices for Phone Recognizers

Modern speech recognition has many ways of quantifying the misrecognitions a speech recognizer makes. Error analysis in modern speech recognition makes extensive use of the Levenshtein algorithm to find the distance between the labeled target and the recognize ...

IDIAP, 2007

Exploiting Contextual Information for Improved Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

In this paper, we investigate the significance of contextual information in a phoneme recognition system using the hidden Markov model - artificial neural network paradigm. Contextual information is probed at the feature level as well as at the output of t ...

IDIAP, 2007

Relevant Feature Selection for Audio-Visual Speech Recognition

Jean-Philippe Thiran, Mihai Gurban, Thomas Drugman

We present a feature selection method based on information theoretic measures, targeted at multimodal signal processing, showing how we can quantitatively assess the relevance of features from different modalities. We are able to find the features with the ...

2007

Non-linear Spectral Contrast Stretching for In-car Speech Recognition

Hervé Bourlard, Weifeng Li

In this paper, we present a novel feature normalization method in the log-scaled spectral domain for improving the noise robustness of speech recognition front-ends. In the proposed scheme, a non-linear contrast stretching is added to the outputs of log me ...

2007

Non-linear Spectral Contrast Stretching for In-car Speech Recognition

Hervé Bourlard, Weifeng Li

IDIAP, 2007