Publication

Robust audio segmentation

Related publications (173)

Infinite Models for Speaker Clustering

Fabio Valente

In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained through a Dirichlet Process Mixture (DPM) model which can be interpreted as a flexible model with an infinite a priori number of components. ...

2006

Infinite Models for Speaker Clustering

Fabio Valente

IDIAP 2006

Combination of Acoustic Classifiers based on Dempster-Shafer Theory of evidence

Hynek Hermansky, Fabio Valente

In this paper we investigate combination of neural net based classifiers using Dempster-Shafer Theory of Evidence. Under some assumptions, combination rule resembles a product of errors rule observed in human speech perception. Different combination are te ...

IDIAP 2006

Using more informative posterior probabilities for speech recognition

Hervé Bourlard, Samy Bengio, Hamed Ketabdar

In this paper, we present initial investigations towards boosting posterior probability based speech recognition systems by estimating more informative posteriors taking into account acoustic context (e.g., the whole utterance), as well as possible prior i ...

2006

Automatic genre classification of music content

Giorgio Zoia, Nicolas Scaringella

This paper reviews the state-of-the-art in automatic genre classification of music collections through three main paradigms: expert systems, unsupervised classification, and supervised classification. The paper discusses the importance of music genres with ...

2006

Multimedia event modelling and recognition

Mark Barnard

The recognition of events in multimedia data is a challenging area of research. The growth in the amount of multimedia data being produced and stored increases the need for systems capable of automatically analysing this data. This analysis can aid in effi ...

EPFL 2005

The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiments

The recognition of speech in meetings poses a number of challenges to current Automatic Speech Recognition (ASR) techniques. Meetings typically take place in rooms with non-ideal acoustic conditions and significant background noise, and may contain large s ...

IDIAP 2005

Using more informative posterior probabilities for speech recognition

Hervé Bourlard, Samy Bengio, Hamed Ketabdar

IDIAP 2005

Aural and automatic forensic speaker recognition in mismatched conditions

Andrzej Drygajlo, Anil Alexander

In this article, we compare aural and automatic speaker recognition in the context of forensic analyses, using a Bayesian framework for the interpretation of evidence. We use perceptual tests performed by non-experts and compare their performance with that ...

2005

Robust Audio Segmentation

Hervé Bourlard, Jitendra Ajmera

Audio segmentation, in general, is the task of segmenting a continuous audio stream into acoustically homogeneous regions, where the rule of homogeneity depends on the task. This thesis aims at developing and investigating efficient, robust and unsup ...

IDIAP 2004