Publication

Automatic Speech Recognition using Dynamic Bayesian Networks with the Energy as an Auxiliary Variable

Related publications (104)

A multimodal pattern recognition framework for speaker detection

Patricia Besson

Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition, or speech synthesis, for example, find multiple applications in human-computer interaction, multimedia content indexin ...

EPFL, 2007

Exploiting Contextual Information for Improved Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

In this paper, we investigate the significance of contextual information in a phoneme recognition system using the hidden Markov model - artificial neural network paradigm. Contextual information is probed at the feature level as well as at the output of t ...

IDIAP, 2007

A Bayesian Switching Linear Dynamical System for Scale-Invariant robust speech extraction

Bertrand Mesot

Most state-of-the-art automatic speech recognition (ASR) systems deal with noise in the environment by extracting noise-robust features, which are subsequently modelled by a Hidden Markov Model (HMM). A limitation of this feature-based approach is that the ...

IDIAP, 2007

Correcting Confusion Matrices for Phone Recognizers

Modern speech recognition has many ways of quantifying the misrecognitions a speech recognizer makes. The analysis of errors in modern speech recognition makes extensive use of the Levenshtein algorithm to find the distance between the labeled target and the recognize ...

IDIAP, 2007

A Novel Statistical Generative Model Dedicated To Face Recognition

Sébastien Marcel, Guillaume Heusch

In this paper, a novel statistical generative model to describe a face is presented and applied to the face authentication task. Classical generative models used so far in face recognition, such as Gaussian Mixture Models (GMM) and Hidden Markov Models ...

IDIAP, 2007

Using Pitch as Prior Knowledge in Template-Based Speech Recognition

Hervé Bourlard, Guillermo Aradilla

In a previous paper on speech recognition, we showed that templates can better capture the dynamics of the speech signal compared to parametric models such as hidden Markov models. The key point in template matching approaches is finding the most similar templ ...

2006

A Bayesian Alternative to Gain Adaptation in Autoregressive Hidden Markov Models

Bertrand Mesot

Models dealing directly with the raw acoustic speech signal are an alternative to conventional feature-based HMMs. A popular way to model the raw speech signal is by means of an autoregressive (AR) process. Being too simple to cope with the nonlinearity of ...

IDIAP, 2006

Novel speech processing techniques for robust automatic speech recognition

Vivek Tyagi

The goal of this thesis is to develop and design new feature representations that can improve automatic speech recognition (ASR) performance in clean as well as noisy conditions. One of the main shortcomings of the fixed scale (typically 20-30 ms long ana ...

EPFL, 2006

Simple Form Recognition using Bayesian Programming

Roland Siegwart, Adriana Tapus, Guy Ramel, François Aspert

The environment that surrounds us is very complex. Understanding and interpreting it is a very hard task. This paper proposes an approach allowing simple form recognition with a camera by using a probabilistic approach called Bayesian Programming. The main ...

2006

Multimedia event modelling and recognition

Mark Barnard

The recognition of events in multimedia data is a challenging area of research. The growth in the amount of multimedia data being produced and stored increases the need for systems capable of automatically analysing this data. This analysis can aid in effi ...

EPFL, 2005