User-Customized Password Speaker Verification based on HMM/ANN and GMM Models
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
The i-vector and Joint Factor Analysis (JFA) systems for text- dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statis- tics average the acoustic information over the utterance ther ...
The i-vector and Joint Factor Analysis (JFA) systems for text- dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statis- tics average the acoustic information over the utterance ther ...
Automatic speech recognition (ASR) is a fascinating area of research towards realizing humanmachine interactions. After more than 30 years of exploitation of Gaussian Mixture Models (GMMs), state-of-the-art systems currently rely on Deep Neural Network (DN ...
We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low- dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse represen ...
Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond ...
We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of lowdimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representa ...
Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it ...
Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it ...
This paper presents Subspace Gaussian Mixture Model (SGMM) approach employed as a probabilistic generative model to estimate speaker vector representations to be subsequently used in the speaker verification task. SGMMs have already been shown to significa ...
Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector and relevance Maximum-a-Posteriori (MAP), have shown to provide state-of-the-art performance for text-dependent systems with fixed phrases. The performance o ...