Tuning-Robust Initialization Methods for Speaker Diarization
Publications associées (59)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
We investigate a state-of-the-art Speaker Diarization system regarding its behavior on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in Speaker Diarization benchmarks. First, the problems inherent to th ...
We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM) ...
Speaker diarization of meetings can be significantly improved by overlap handling. Several previous works have explored the use of different features such as spectral, spatial and energy for overlap detection. This paper proposes a method to estimate proba ...
This paper investigates a typical Speaker Diarization system regarding its robustness against initialization parameter variation and presents a method to reduce manual tuning of these values significantly. The behavior of an agglomerative hierarchical clus ...
We introduce a fast approach to classification and clustering applicable to high-dimensional continuous data, based on Bayesian mixture models for which explicit computations are available. This permits us to treat classification and clustering in a single ...
Cross-lingual speaker adaptation (CLSA) has emerged as a new challenge in statistical parametric speech syn- thesis, with specific application to speech-to-speech translation. Recent research has shown that reasonable speaker similarity can be achieved in ...
The following article presents a novel, adaptive initialization scheme that can be applied to most state-ofthe-art Speaker Diarization algorithms, i.e. algorithms that use agglomerative hierarchical clustering with Bayesian Information Criterion (BIC) and ...
This paper presents a system for keyword detection in spontaneous speech. Keywords are predefined through a set of acoustic examples provided by the users. Keyword detection proceeds in two steps: keyword searching and verification. To address the problem ...
We investigate a state-of-the-art Speaker Diarization system regarding its behavior on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in Speaker Diarization benchmarks. First, the problems inherent to th ...
The following article presents a novel, adaptive initialization scheme that can be applied to most state-ofthe-art Speaker Diarization algorithms, i.e. algorithms that use agglomerative hierarchical clustering with Bayesian Information Criterion (BIC) and ...