Using more informative posterior probabilities for speech recognition
Publications associées (37)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Speaker diarization is the task of identifying ``who spoke when'' in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostical studies on state-of-the-art diarization sys ...
EPFL2015
Automatic processing of multiparty interactions is a research domain with important applications in content browsing, summarization and information retrieval. In recent years, several works have been devoted to find regular patterns which speakers exhibit ...
In this thesis, we investigate the use of posterior probabilities of sub-word units directly as input features for automatic speech recognition (ASR). These posteriors, estimated from data-driven methods, display some favourable properties such as increase ...
The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...
Using phone posterior probabilities has been increasingly explored for improving automatic speech recognition (ASR) systems. In this paper, we propose two approaches for hierarchically enhancing these phone posteriors, by integrating long acoustic context, ...
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR-based adaptation tec ...
2014
, ,
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR- based adaptation te ...
Idiap2012
This work describes a solution to the validation challenge problem posed at the SANDIA Validation Challenge Workshop, May 21-23, 2006, NM. It presents and applies a general methodology to it. The solution entails several standard steps, namely selecting an ...