Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Tandem systems transform the cepstral features into posterior probabilities of subword units using artificial neural networks (ANNs), which are processed to form input features for conventional speech recognition systems. They have been shown to perform be ...
This paper investigates an approach that maximizes the joint posterior probabil ity of the pronounced word and the speaker identity given the observed data. This probability can be expressed as a product of the posterior probability of the pronounced word ...
Automatic speech/music discrimination has been receiving importance recently, for example when large multimedia documents have to be processed by an ASR system, or for indexing and retrieval of such documents. This work presents using outputs of a speech r ...
In this paper we develop different mathematical models in the framework of the multi-stream paradigm for noise robust ASR, and discuss their close relationship with human speech perception. Largely inspired by Fletcher's "product-of-errors" rule in psychoa ...
Local state (or phone) posterior probabilities are often investigated as local classifiers (e.g., hybrid HMM/ANN systems) or as transformed acoustic features (e.g., ``Tandem'') towards improved speech recognition systems. In this paper, we present initial ...
In this paper, we present and investigate a new method for subband-based Automatic Speech Recognition (ASR) which approximates the ideal full combination' approach which is itself often not practical to realize. The full combination' approach consists of ...
Tandem systems transform the cepstral features into posterior probabilities of subword units using artificial neural networks (ANNs), which are processed to form input features for conventional speech recognition systems. They have been shown to perform be ...
This paper investigates an approach that maximizes the joint posterior probabil ity of the pronounced word and the speaker identity given the observed data. This probability can be expressed as a product of the posterior probability of the pronounced word ...
Standard ASR systems typically use phoneme as the subword units. Preliminary studies have shown that the performance of the ASR system could be improved by using grapheme as additional subword units. In this paper, we investigate such a system where the wo ...
In this paper we develop different mathematical models in the framework of the multi-stream paradigm for noise robust ASR, and discuss their close relationship with human speech perception. Largely inspired by Fletcher's "product-of-errors" rule in psychoa ...