Publication

Combining Acoustic Data Driven G2P and Letter-to-Sound Rules for Under Resource Lexicon Generation

Related publications (32)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Acoustic Data-driven Grapheme-to-Phoneme Conversion using KL-HMM

Ramya Rasipuram

This paper proposes a novel grapheme-to-phoneme (G2P) conversion approach where first the probabilistic relation between graphemes and phonemes is captured from acoustic data using Kullback-Leibler divergence based hidden Markov model (KL-HMM) system. Then ...

Idiap2011

Improving Articulatory Feature and Phoneme Recognition using Multitask Learning

Ramya Rasipuram

Speech sounds can be characterized by articulatory features. Articulatory features are typically estimated using a set of multilayer perceptrons (MLPs), i.e., a separate MLP is trained for each articulatory feature. In this paper, we investigate multitask ...

Springer Berlin / Heidelberg2011

Grapheme-based Automatic Speech Recognition using KL-HMM

Hervé Bourlard, Ramya Rasipuram, Guillermo Aradilla

The state-of-the-art automatic speech recognition (ASR) systems typically use phonemes as subword units. In this work, we present a novel grapheme-based ASR system that jointly models phoneme and grapheme information using Kullback-Leibler divergence-based ...

2011

MULTITASK LEARNING TO IMPROVE ARTICULATORY FEATURE ESTIMATION AND PHONEME RECOGNITION

Ramya Rasipuram

Idiap2011

Overcoming Asynchrony in Audio-Visual Speech Recognition

Jean-Philippe Thiran, Virginia Estellers Casas

In this paper we propose two alternatives to overcome the natural asynchrony of modalities in Audio-Visual Speech Recognition. We first investigate the use of asynchronous statistical models based on Dynamic Bayesian Networks with different levels of async ...

2010

On Joint Modelling of Grapheme and Phoneme Information using KL-HMM for ASR

Hervé Bourlard, Guillermo Aradilla

In this paper, we propose a simple approach to jointly model both grapheme and phoneme information using Kullback-Leibler divergence based HMM (KL-HMM) system. More specifically, graphemes are used as subword units and phoneme posterior probabilities estim ...

Idiap2009

A Kernel Wrapper for Phoneme Sequence Recognition

We describe a kernel wrapper, a Mercer kernel for the task of phoneme sequence recognition which is based on operations with the Gaussian kernel, and suitable for any sequence kernel classifier. We start by presenting a kernel-based algorithm for phoneme s ...

John Wiley and Sons2009

Enhancing posterior based speech recognition systems

Hamed Ketabdar

The use of local phoneme posterior probabilities has been increasingly explored for improving speech recognition systems. Hybrid hidden Markov model / artificial neural network (HMM/ANN) and Tandem are the most successful examples of such systems. In this ...

EPFL2008

Enhancing posterior based speech recognition systems

Hamed Ketabdar

Ecole Polytechnique Fédérale de Lausanne2008

Fast Approximate Spoken Term Detection from Sequence of Phonemes

Hynek Hermansky, Joel Praveen Pinto

We investigate the detection of spoken terms in conversational speech using phoneme recognition with the objective of achieving smaller index size as well as faster search speed. Speech is processed and indexed as a sequence of one best phoneme sequence. W ...

2008