Publication

Integrating audio and vision for robust automatic gender recognition

Related publications (36)

Low-Dimensional Motion Features for Audio-Visual Speech Recognition

Jean-Philippe Thiran, Mihai Gurban, Andrés Vallés

Audio-visual speech recognition promises to improve the performance of speech recognizers, especially when the audio is corrupted, by adding information from the visual modality, more specifically, from the video of the speaker. However, the number of visu ...

2007

Blind Audio-Visual Source Separation Using Sparse Redundant Representations

Pierre Vandergheynst, Gianluca Monaci, Anna Llagostera Casanovas

This report presents a new method to confront the Blind Audio Source Separation (BASS) problem, by means of audio and visual information. In a given mixture, we are able to locate the video sources first and, posteriorly, recover each source signal, only w ...

2006

Multimodal Speaker Localization in a Probabilistic Framework

Jean-Philippe Thiran, Mihai Gurban

A multimodal probabilistic framework is proposed for the problem of finding the active speaker in a video sequence. We localize the current speaker's mouth in the image by using the video and the audio channels together. We propose a novel visual feature t ...

IEEE, 2006

Improved Time Delay Analysis/Synthesis for Parametric Stereo Audio Coding

Christof Faller, Christophe Tournery

For parametric stereo and multi-channel audio coding, it has been proposed to use level difference, time difference, and coherence cues between audio channels to represent the perceptual spatial features of stereo and multi-channel audio signals. In practi ...

2006

Parametric coding of spatial audio

Christof Faller

A wide range of techniques for coding a single speech or audio signal channel have been developed over the last few decades. In addition to pure redundancy reduction, sophisticated source and receiver models have been considered for reducing the bitrate. O ...

EPFL, 2004

The IDIAP Smart Meeting Room

Darren Moore

The IDIAP Smart Meeting Room is a meeting room equipped with synchronised, multi-channel audio-visual recording facilities. This document presents a detailed description of the room with particular emphasis on the acquisition equipment and the components u ...

IDIAP, 2002