Publication

Using the Multi-Stream Approach for Continuous Audio-Visual Speech Recognition

Related publications (33)

Efficient integration of automated speech recognition in the framework of dialogue-based vocal systems

In this work, we propose different strategies for efficiently integrating an automated speech recognition module in the framework of a dialogue-based vocal system. The aim is the study of different ways leading to the improvement of the quality and robustn ...

EPFL, 2005

Improving Speech Recognition Using a Data-Driven Approach

Hervé Bourlard, Guillermo Aradilla

In this paper, we investigate the possibility of enhancing state-of-the-art HMM-based speech recognition systems using data-driven techniques, where the whole set of training utterances is used as reference models and recognition is then performed through the ...

IDIAP, 2005

Improving Speech Recognition Using a Data-Driven Approach

Hervé Bourlard, Guillermo Aradilla

2005

Spectro-Temporal Activity Pattern (STAP) Features for Noise Robust ASR

Hervé Bourlard, Hemant Misra, Shajith Ikbal

In this paper, we introduce a new noise-robust representation of the speech signal, obtained by locating points of potential importance in the spectrogram and parameterizing the activity of the time-frequency pattern around those points. These features are referre ...

2004

Spectro-Temporal Activity Pattern (STAP) Features for Noise Robust ASR

Hervé Bourlard, Hemant Misra, Shajith Ikbal

IDIAP, 2004

Acoustic Echo Cancellation for Human-Robot Communications

Jérôme Berclaz

This master's thesis presents a new, efficient method of acoustic echo cancellation targeted at speech recognition for robots. The proposed algorithm features a new double-talk detector (DTD), an enhanced initialization and a new noise estimation method. The DTD al ...

2004

Using pitch frequency information in speech recognition

Hervé Bourlard

Automatic speech recognition systems typically use smoothed spectral features as acoustic observations. Recent studies have shown that complementing these standard features with pitch frequency could improve system performance. ...

IDIAP, 2003

Using pitch frequency information in speech recognition

Hervé Bourlard

2003

Automatic Speech Recognition using Pitch Information in Dynamic Bayesian Networks

Hervé Bourlard

The challenge of automatic speech recognition (ASR) increases when speaker variability is encountered. Being able to automatically use different acoustic models according to speaker type might help to increase the robustness of ASR. We present a system tha ...

IDIAP, 2000

Reconnaissance et transformation de locuteurs

Dominique Genoud

This PhD thesis investigates how to analyse, decompose, model and transform the vocal identity of a human as seen through an automatic speaker recognition application. It starts with an introduction explaining the properties of the speech signal a ...

EPFL, 1999