Publication

Introducing Temporal Asymmetries in Feature Extraction for Automatic Speech Recognition

Related publications (50)

Experimental investigation into localized instabilities of mixed Rayleigh-Bénard-Poiseuille convection

Emeric Grandjean

An experimental study of the stability of the Rayleigh-Bénard-Poiseuille flow was performed in a large transverse aspect ratio channel. The onset for the transverse thermo-convective rolls was determined as a function of the Reynolds number for two differe ...

EPFL, 2008

Adaptive Beamforming with a Maximum Negentropy Criterion

Philip Neil Garner, Weifeng Li

In this paper, we address an adaptive beamforming application in realistic acoustic conditions. After the position of a speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelobe canceller (GSC) conf ...

2008

Adaptive Beamforming with a Maximum Negentropy Criterion

Philip Neil Garner, Weifeng Li

In this paper, we address an adaptive beamforming application in realistic acoustic conditions. After the position of a speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelo ...

IDIAP, 2008

A multimodal pattern recognition framework for speaker detection

Patricia Besson

Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition or speech synthesis for example find multiple applications in human-computer interaction, multimedia content indexin ...

EPFL, 2007

Dynamic Measurement of Room Impulse Responses using a Moving Microphone

Martin Vetterli, Luciano Sbaiz, Thibaut Ajdler

A novel technique for the recording of large sets of room impulse responses or head-related transfer functions is presented. The technique uses a microphone or a loudspeaker moving with constant speed. Given a setup (e.g. length of the room impulse respons ...

2007

Towards using slide information to enhance speech transcription of meetings

Hervé Bourlard, Artem Peregoudov, Alessandro Vinciarelli

In this paper we investigate the possibility of improving the speech recognition performance of meeting recordings by using slides captured during the recording process. The key hypothesis exploited in this work is that both slides and speech carry correla ...

IDIAP, 2006

Novel speech processing techniques for robust automatic speech recognition

Vivek Tyagi

The goal of this thesis is to develop and design new feature representations that can improve the automatic speech recognition (ASR) performance in clean as well as noisy conditions. One of the main shortcomings of the fixed scale (typically 20-30 ms long ana ...

EPFL, 2006

Robust audio segmentation

Jitendra Ajmera

Audio segmentation, in general, is the task of segmenting a continuous audio stream in terms of acoustically homogeneous regions, where the rule of homogeneity depends on the task. This thesis aims at developing and investigating efficient, robust and unsup ...

EPFL, 2005

Nonlinear feature transformations for noise robust speech recognition

Shajith Ikbal

Robustness against external noise is an important requirement for automatic speech recognition (ASR) systems, when it comes to deploying them for practical applications. This thesis proposes and evaluates new feature-based approaches for improving the ASR ...

EPFL, 2004

Robust Audio Segmentation

Hervé Bourlard, Jitendra Ajmera

IDIAP, 2004