Concept

Wireless speaker

Related publications (8)

Template-matching for text-dependent speaker verification

In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...

2017

System fusion and speaker linking for longitudinal diarization of TV shows

Hervé Bourlard, Petr Motlicek

Performing speaker diarization while uniquely identifying the speakers in a collection of audio recordings is a challenging task. Based on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data s ...

IEEE, 2016

Modified group delay feature based total variability space modelling for speaker recognition

In this paper, modified group delay (MODGD) features are used to model target speakers in the Total Variability Space (TVS) framework for speaker recognition. MODGD based features have been shown to improve speaker recognition performance owing to the abil ...

2015

Lower and upper bounds for approximation of the Kullback-Leibler divergence between Gaussian Mixture Models

Jean-Philippe Thiran, Finnian Paul Kelly

Many speech technology systems rely on Gaussian Mixture Models (GMMs). The need for a comparison between two GMMs arises in applications such as speaker verification, model selection or parameter estimation. For this purpose, the Kullback-Leibler (KL) dive ...

IEEE, 2012

Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition

Andrzej Drygajlo

Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide ...

Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, NY 10010, USA, 2009

Visual feature analysis for audio-visual speech recognition

Ivana Arsic de Heras Ciechomska

Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...

EPFL, 2008

Speaker recognition in noisy environments using auxiliary information and Bayesian networks

Speaker recognition systems achieve acceptable performance in controlled laboratory conditions. However, in real-life environments, the performance of a speaker recognition system degrades drastically, the principal cause being the mismatch that exists bet ...

EPFL, 2006

Joint speech and speaker recognition

The goal of the thesis is to investigate different approaches that combine and integrate Automatic Speech Recognition (ASR) and Speaker Recognition (SR) systems, with applications to (1) User-Customized Password Speaker Verification (UCP-SV) systems, and, ...

EPFL, 2005