The performance of speaker recognition systems has considerably improved in the last decade. This is mainly due to the development of Gaussian mixture model-based systems and in particular to the use of i-vectors. These systems handle relatively well noise ...
Speaker verification systems traditionally extract and model cepstral features or filter bank energies from the speech signal. In this paper, inspired by the success of neural network-based approaches that directly model the raw speech signal for applications su ...
In this paper, modified group delay (MODGD) features are used to model target speakers in the Total Variability Space (TVS) framework for speaker recognition. MODGD-based features have been shown to improve speaker recognition performance owing to the abil ...
The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this pap ...
In this thesis, methods and models are developed and presented aiming at the estimation, restoration and transformation of the characteristics of human speech. During a first period of the thesis, a concept was developed that allows restoring prosodic voic ...
The advent of statistical parametric speech synthesis has paved the way for a unified framework for hidden Markov model (HMM)-based text-to-speech synthesis (TTS) and automatic speech recognition (ASR). The techniques and advancements made in the field of ...
In this paper, we introduce a new class of noise-robust features derived from an alternative measure of autocorrelation representing the phase variation of the speech signal frame over time. These features, referred to as Phase AutoCorrelation (PAC) features i ...
This paper investigates robust privacy-sensitive audio features for speaker diarization in multiparty conversations, i.e., a set of audio features having low linguistic information for speaker diarization in a single and multiple distant microphone scenario ...
Speaker verification on portable devices like smartphones is gradually becoming popular. In this context, two issues need to be considered: 1) such devices have relatively limited computation resources, and 2) they are liable to be used everywhere, possibl ...
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...