Publication

Integrating articulatory features using Kullback-Leibler divergence based acoustic model for phoneme recognition

Related publications (37)

Template-matching for text-dependent speaker verification

Petr Motlicek, Subhadeep Dey

In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
2017

Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures

Hervé Bourlard, Milos Cernak, Afsaneh Asaei

This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy relying on posteriors' intrinsic sparsity structures. The posterior probabilities are estimated for phonetic an ...
Idiap, 2016

Low-Rank Representation For Enhanced Deep Neural Network Acoustic Models

Automatic speech recognition (ASR) is a fascinating area of research towards realizing human-machine interactions. After more than 30 years of exploitation of Gaussian Mixture Models (GMMs), state-of-the-art systems currently rely on Deep Neural Network (DN ...
Idiap, 2016

Efficient Posterior Exemplar Search Space Hashing Exploiting Class-Specific Sparsity Structures

Hervé Bourlard, Milos Cernak, Afsaneh Asaei

This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy relying on posteriors' intrinsic sparsity structures. The posterior probabilities are estimated for phonetic an ...
2016

Speaker diarization of spontaneous meeting room conversations

Sree Harsha Yella

Speaker diarization is the task of identifying "who spoke when" in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization sys ...
EPFL, 2015

Overlapping speech detection using long-term conversational features for speaker diarization in meeting room conversations

Hervé Bourlard, Sree Harsha Yella

Overlapping speech has been identified as one of the main sources of errors in diarization of meeting room conversations. Therefore, overlap detection has become an important step prior to speaker diarization. Studies on conversational analysis have shown ...
2014

Applying Multi- and Cross-Lingual Stochastic Phone Space Transformations to Non-Native Speech Recognition

Hervé Bourlard, Philip Neil Garner, John David Scott Dines, David Imseng

In the context of hybrid HMM/MLP Automatic Speech Recognition (ASR), this paper describes an investigation into a new type of stochastic phone space transformation, which maps "source" phone (or phone HMM state) posterior probabilities (as obtained at the ...
IEEE (Institute of Electrical and Electronics Engineers), 2013

Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition

David Imseng, Ramya Rasipuram

One of the main challenges in non-native speech recognition is how to handle the acoustic variability present in multi-accented non-native speech with a limited amount of training data. In this paper, we investigate an approach that addresses this challenge by usi ...
Idiap, 2012

Boosting under-resourced speech recognizers by exploiting out of language data - Case study on Afrikaans

Hervé Bourlard, Philip Neil Garner, David Imseng

Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we boost the performance of an Afrikaans speech recognizer by using already available data from other languages. To successfully exploit av ...
Idiap, 2012
