Automatic Speech Recognition using Dynamic Bayesian Networks with both Acoustic and Articulatory Variables
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
We present a theoretical investigation into the use of normalised artificial neural network (ANN) outputs in the context of hidden Markov models (HMMs). The work is motivated by the pursuit of a more theoretically rigorous understanding of the Kullback-Lie ...
Using phone posterior probabilities has been increasingly explored for improving automatic speech recognition (ASR) systems. In this paper, we propose two approaches for hierarchically enhancing these phone posteriors, by integrating long acoustic context, ...
Objective assessment of synthetic speech intelligibility can be a useful tool for the development of text-to-speech (TTS) systems, as it provides a reproducible and inexpensive alternative to subjective listening tests. In a recent work, it was shown that ...
There is magic (or is it witchcraft?) in a speech recognizer that transcribes continuous radio speech into text with a word accuracy of even not more than 50%. The extreme difficulty of this task, tough, is usually not perceived by the general public. This ...
Objective assessment of synthetic speech intelligibility can be a useful tool for the development of text-to-speech (TTS) systems, as it provides a reproducible and inexpensive alternative to subjective listening tests. In a recent work, it was shown that ...
Automatic non-native accent assessment has potential benefits in language learning and speech technologies. The three fundamental challenges in automatic accent assessment are to characterize, model and assess individual variation in speech of the non-nati ...
In this paper we propose two alternatives to overcome the natural asynchrony of modalities in Audio-Visual Speech Recognition. We first investigate the use of asynchronous statistical models based on Dynamic Bayesian Networks with different levels of async ...
This research takes place in the general context of improving the performance of the Distant Speech Recognition (DSR) systems, tackling the reverberation and recognition of overlap speech. Perceptual modeling indicates that sparse representation exists in ...
In this thesis, methods and models are developed and presented aiming at the estimation, restoration and transformation of the characteristics of human speech. During a first period of the thesis, a concept was developed that allows restoring prosodic voic ...