Publication

Automatic Speech Recognition using Dynamic Bayesian Networks with the Energy as an Auxiliary Variable

2003
Report or working paper
Abstract

In current automatic speech recognition (ASR) systems, the energy is not used as part of the feature vector, even though it is a fundamental feature of the speech signal, because the noise inherent in its estimation degrades system performance. In this report we present an alternative approach for introducing the energy into the system so that it helps recognition. We present experimental results for an ASR system based on dynamic Bayesian networks (DBNs) that uses the energy as an auxiliary variable. DBNs belong to the same family of statistical models as hidden Markov models (HMMs), but they form a more general framework and allow more flexibility in defining new probabilistic relations between variables. We tried several network topologies and found that conditioning the feature vector on the energy is beneficial. Furthermore, hiding the value of the energy during recognition also improved recognition performance.
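The two ideas in the abstract, conditioning the feature vector on the energy and hiding the energy at recognition time, can be illustrated with a minimal sketch. It is not the report's implementation. It assumes the energy is discretized into a few bins `a`, that each state `q` has a prior `p(a | q)`, and that the emission density `p(x | q, a)` is a diagonal-covariance Gaussian. The function names, the binning, and the toy parameters are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def emission_loglik(x, means, variances, log_p_a, a=None):
    """Log-likelihood of feature vector x for a single state q (assumed model).

    means[k], variances[k]: Gaussian parameters for energy bin k
    log_p_a[k]:             log p(a = k | q)
    a: observed energy bin index, or None to treat the energy as hidden
    """
    # log p(x, a = k | q) = log p(a = k | q) + log p(x | q, a = k)
    joint = [lp + log_gauss_diag(x, m, v)
             for lp, m, v in zip(log_p_a, means, variances)]
    if a is not None:
        # Energy observed: use the term for the observed bin only.
        return joint[a]
    # Energy hidden: log sum_k p(a = k | q) p(x | q, a = k), computed stably.
    top = max(joint)
    return top + math.log(sum(math.exp(j - top) for j in joint))

# Toy parameters for one state with two energy bins (illustrative values only).
means = [[0.0, 0.0], [2.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
log_p_a = [math.log(0.5), math.log(0.5)]
x = [0.1, -0.2]

observed = emission_loglik(x, means, variances, log_p_a, a=0)
hidden = emission_loglik(x, means, variances, log_p_a)
```

With the energy hidden, the emission score sums over all energy bins, so a noisy energy estimate cannot force the model onto the wrong conditional Gaussian. This is one plausible reading of why hiding the energy could help, not a claim taken from the report.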

