Publication

Using KL-divergence and multilingual information to improve ASR for under-resourced languages

Hervé Bourlard, Philip Neil Garner, David Imseng
2012
Conference paper

Abstract

Setting out from the point of view that automatic speech recognition (ASR) ought to benefit from data in languages other than the target language, we propose a novel Kullback-Leibler (KL) divergence-based method that exploits multilingual information in the form of universal phoneme posterior probabilities conditioned on the acoustics. We formulate a means to train a recognizer on several different languages and subsequently recognize speech in a target language for which only a small amount of data is available. Taking the Greek SpeechDat(II) data as an example, we show that the proposed formulation is sound and that it can outperform a current state-of-the-art HMM/GMM system. We also use a hybrid Tandem-like system to further understand the source of the benefit.
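The abstract does not spell out the formulation, so the following is only a minimal sketch of the underlying quantity: the KL divergence between two discrete distributions over a phoneme set. The three-phoneme set, the reference distributions, the names `frame_posterior` and `state_distributions`, and the choice of divergence direction are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            # Floor q_i to avoid dividing by zero when q assigns no mass.
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical phoneme posterior for one acoustic frame (sums to 1).
frame_posterior = [0.7, 0.2, 0.1]

# Hypothetical reference distributions, one per model state (illustrative only).
state_distributions = {
    "state_a": [0.8, 0.1, 0.1],
    "state_b": [0.1, 0.8, 0.1],
}

# Pick the state whose distribution is closest to the frame posterior.
closest_state = min(
    state_distributions,
    key=lambda name: kl_divergence(frame_posterior, state_distributions[name]),
)
print(closest_state)
```

Because the KL divergence is asymmetric, swapping the arguments gives a different score; which direction (or a symmetric combination) suits a given recognizer is a design choice this sketch does not settle.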

Official source

https://infoscience.epfl.ch/record/192453?ln=en

Related publications (32)

Sparse Autoencoders for Speech Modeling and Recognition

Fair Voice Biometrics: Impact of Demographic Imbalance on Group Fairness in Speaker Recognition

AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS
