Publication

Speaker diarization of spontaneous meeting room conversations

Related publications (77)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Unsupervised Speech/Non-speech Detection for Automatic Speech Recognition in Meeting Rooms

Daniel Gatica-Perez, Petr Motlicek

The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution is based on computing long-term modulation spectrum, and examining specific frequency range for dominant ...

IDIAP2006

Using Pitch as Prior Knowledge in Template-Based Speech Recognition

Hervé Bourlard, Guillermo Aradilla

In a previous paper on speech recognition, we showed that templates can better capture the dynamics of speech signal compared to parametric models such as hidden Markov models. The key point in template matching approaches is finding the most similar templ ...

2006

Using more informative posterior probabilities for speech recognition

Hervé Bourlard, Samy Bengio, Hamed Ketabdar

In this paper, we present initial investigations towards boosting posterior probability based speech recognition systems by estimating more informative posteriors taking into account acoustic context (e.g., the whole utterance), as well as possible prior i ...

2006

Speaker recognition in noisy environments using auxiliary information and Bayesian networks

Speaker recognition systems achieve acceptable performance in controlled laboratory conditions. However, in real-life environments, the performance of a speaker recognition system degrades drastically, the principal cause being the mismatch that exists bet ...

EPFL2006

Spatio-Temporal Analysis of Spontaneous Speech with Microphone Arrays

Guillaume Lathoud

Accurate detection, localization and tracking of multiple moving speakers permits a wide spectrum of applications. Techniques are required that are versatile, robust to environmental variations, and not constraining for non-technical end-users. Based on di ...

IDIAP2006

Spatio-Temporal Analysis of Spontaneous Speech with Microphone Arrays

Guillaume Lathoud

École Polytechnique Fédérale de Lausanne2006

A Max Kernel For Text-Independent Speaker Verification Systems

Samy Bengio

In this paper, we present a principled SVM based speaker verification system. A general approach to compute two sequences of frames is developed that enables the use of any kernel at the frame level. An extension of this approach using the Max operator is ...

2006

Robust audio segmentation

Jitendra Ajmera

Audio segmentation, in general, is the task of segmenting a continuous audio stream in terms of acoustically homogenous regions, where the rule of homogeneity depends on the task. This thesis aims at developing and investigating efficient, robust and unsup ...

EPFL2005

Using Pitch as Prior Knowledge in Template-Based Speech Recognition

Hervé Bourlard, Guillermo Aradilla

IDIAP2005

The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments

The recognition of speech in meetings poses a number of challenges to current Automatic Speech Recognition (ASR) techniques. Meetings typically take place in rooms with non-ideal acoustic conditions and significant background noise, and may contain large s ...

IDIAP2005