INTEGRATING ARTICULATORY FEATURES USING KULLBACK-LEIBLER DIVERGENCE BASED ACOUSTIC MODEL FOR PHONEME RECOGNITION
Automatic speech recognition (ASR) is a fascinating area of research towards realizing human-machine interactions. After more than 30 years of exploitation of Gaussian Mixture Models (GMMs), state-of-the-art systems currently rely on Deep Neural Network (DN ...
Idiap, 2016
This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy relying on posteriors' intrinsic sparsity structures. The posterior probabilities are estimated for phonetic an ...
Idiap, 2016
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
Overlapping speech has been identified as one of the main sources of errors in diarization of meeting room conversations. Therefore, overlap detection has become an important step prior to speaker diarization. Studies on conversational analysis have shown ...
In the context of hybrid HMM/MLP Automatic Speech Recognition (ASR), this paper describes an investigation into a new type of stochastic phone space transformation, which maps "source" phone (or phone HMM state) posterior probabilities (as obtained at the ...
Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we boost the performance of an Afrikaans speech recognizer by using already available data from other languages. To successfully exploit av ...
Idiap, 2012
Speaker diarization is the task of identifying "who spoke when" in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization sys ...
EPFL, 2015
One of the main challenges in non-native speech recognition is how to handle the acoustic variability present in multi-accented non-native speech with a limited amount of training data. In this paper, we investigate an approach that addresses this challenge by usi ...
Idiap, 2012