Publication

Clustering And Segmenting Speakers And Their Locations In Meetings

Related publications (34)

Semi-supervised Extraction of Audio-Visual Sources

Patricia Calatayud Martinez

This report presents a semi-supervised method to jointly extract audio-visual sources from a scene. It consists of applying a supervised method to segment the video signal followed by an automatic process to properly separate the audio track. This approach ...

2010

Multi-Modal Speaker Diarization of Real-World Meetings Using Compressed-Domain Video Features

Speaker diarization is originally defined as the task of determining “who spoke when” given an audio track and no other prior knowledge of any kind. The following article shows a multi-modal approach where we improve a state-of-the-art speaker diarizati ...

2009

Obtaining Binaural Room Impulse Responses from B-Format Impulse Responses

Christof Faller, Fritz Menzer

Given a set of head related transfer functions (HRTFs) and a room impulse response measured with a Soundfield microphone, the proposed technique computes binaural room impulse responses (BRIRs) which are similar to binaural room impulse responses that woul ...

2008

Blind audiovisual source separation using sparse representations

Pierre Vandergheynst, Gianluca Monaci, Anna Llagostera Casanovas

In this work we present a method to jointly separate active audio and visual structures on a given mixture. Blind Audiovisual Source Separation is achieved exploiting the coherence between a video signal and a one-microphone audio track. The efficient repr ...

IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331, USA, 2007

Blind Audiovisual Source Separation Using Sparse Representations

Pierre Vandergheynst, Gianluca Monaci, Anna Llagostera Casanovas

In this work we present a method to jointly separate active audio and visual structures on a given mixture. Blind Audiovisual Source Separation is achieved exploiting the coherence between a video signal and a one-microphone audio track. The efficient repr ...

2007

Towards using slide information to enhance speech transcription of meetings

Hervé Bourlard, Artem Peregoudov, Alessandro Vinciarelli

In this paper we investigate the possibility of improving the speech recognition performance of meeting recordings by using slides captured during the recording process. The key hypothesis exploited in this work is that both slides and speech carry correla ...

IDIAP, 2006

Sociometry Based Multiparty Audio Recordings Segmentation

Alessandro Vinciarelli

This paper shows how Social Network Analysis, the sociological domain studying the interaction between people in specific social environments, can be used to assign roles to different speakers in multiparty recordings. The experiments presented in this wor ...

IDIAP, 2005

Clustering And Segmenting Speakers And Their Locations In Meetings

Guillaume Lathoud, Jitendra Ajmera

This paper presents a new approach toward automatic annotation of meetings in terms of speaker identities and their locations. This is achieved by segmenting the audio recordings using two independent sources of information: magnitude spectrum analysis an ...

2004

Parametric coding of spatial audio

Christof Faller

A wide range of techniques for coding a single speech or audio signal channel have been developed over the last few decades. In addition to pure redundancy reduction, sophisticated source and receiver models have been considered for reducing the bitrate. O ...

EPFL, 2004

Fractal additive synthesis

Pietro Polotti

Musical and audio signals in general form a major part of the large amount of data exchange taking place in our information-based society. Transmission of high quality audio signals through narrow-band channels, such as the Internet, requires refined metho ...

EPFL, 2003