Integrated Pronunciation Learning for Automatic Speech Recognition Using Probabilistic Lexical Modeling

Standard automatic speech recognition (ASR) systems use a phoneme-based pronunciation lexicon prepared by linguistic experts. When the handcrafted pronunciations fail to cover the vocabulary of a new domain, a grapheme-to-phoneme (G2P) converter is used to extract pronunciations for the new words, and a phoneme-based ASR system is then trained. G2P converters are typically trained only on existing lexicons. In this paper, we propose a grapheme-based ASR approach in the framework of probabilistic lexical modeling that integrates pronunciation learning as a stage in ASR system training and exploits both acoustic and lexical resources (not necessarily from the domain or language of interest). The proposed approach is evaluated on four lexical-resource-constrained ASR tasks and compared with the conventional two-stage approach, in which G2P training is followed by ASR system development.

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

On matching data and model in LF-MMI-based dysarthric speech recognition

A Comparison of Methods for OOV-Word Recognition on a New Public Dataset
