KL Realignment for Speaker Diarization with Multiple Feature Streams
Related publications (35)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
In this paper, we present a method for integrating possible prior knowledge (such as phonetic and lexical knowledge), as well as acoustic context (e.g., the whole utterance) in the phone posterior estimation, and we propose to use the obtained posteriors a ...
Using phone posterior probabilities has been increasingly explored for improving automatic speech recognition (ASR) systems. In this paper, we propose two approaches for hierarchically enhancing these phone posteriors, by integrating long acoustic context, ...
In this thesis, the framework of multi-stream combination has been explored to improve the noise robustness of automatic speech recognition (ASR) systems. The central idea of multi-stream ASR is to combine information from several sources to improve the pe ...
In this thesis, the framework of multi-stream combination has been explored to improve the noise robustness of automatic speech recognition (ASR) systems. The central idea of multi-stream ASR is to combine information from several sources to improve the pe ...
Posterior probabilities of sub-word units have been shown to be an effective front-end for ASR. However, attempts to model this type of features either do not benefit from modeling context-dependent phonemes, or use an inefficient distribution to estimate ...
In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utt ...
In this thesis, the framework of multi-stream combination has been explored to improve the noise robustness of automatic speech recognition (ASR) systems. The central idea of multi-stream ASR is to combine information from several sources to improve the pe ...
The one-dimensional fragmentation of a dynamically expanding ring (Mott's problem) is studied numerically to obtain the fragment signatures under different strain rates. An empirical formula is proposed to calculate an average fragment size. Rayleigh distr ...
A navigation process is to start from a known (initial) position and to ensure a continued localisation of the user during the movement. Consider a pedestrian navigation system which contains a GPS receiver and a set of inertial sensors connected with the ...
In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utt ...