Posterior-based Sparse Representation for Automatic Speech Recognition
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
This paper addresses the problem of detecting speech utterances from a large audio archive using a simple spoken query, hence referring to this problem as "Query by Example Spoken Term Detection" (QbE-STD). This still open pattern matching problem has been ...
This paper focuses on the problem of query by example spoken term detection (QbE-STD) in a zero-resource scenario. Current state-of-the-art approaches to tackle this problem rely on dynamic programming based template matching techniques using phone posterior ...
Modeling directly raw waveform through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different t ...
Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector and relevance Maximum-a-Posteriori (MAP), have been shown to provide state-of-the-art performance for text-dependent systems with fixed phrases. The performance o ...
In this paper, we introduce a novel approach for Language Identification (LID). Two commonly used state-of-the-art methods based on UBM/GMM I-vector technique, combined with a back-end classifier, are first evaluated. The differential factor between these ...
This paper describes a high-performance, innovative and sustainable Speaker Identification (SID) solution running over a large database of voice samples. The solution is based on development, integration and fusion of a series of speech analytic algorithms whic ...
Over these last few years, the use of Artificial Neural Networks (ANNs), now often referred to as deep learning or Deep Neural Networks (DNNs), has significantly reshaped research and development in a variety of signal and information processing tasks. Whi ...
This paper describes SIIP (Speaker Identification Integrated Project), a high-performance, innovative and sustainable Speaker Identification (SID) solution running over a large database of voice samples. The solution is based on development, integration and fusi ...
We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse represen ...