An Information Theoretic Approach to Speaker Diarization of Meeting Data
Information theory has been used as an organizing principle in neuroscience for several decades. Estimates of the mutual information (MI) between signals acquired in neurophysiological experiments are believed to yield insights into the structure of the un ...
Here, I review facts that are most probably known, namely that the information gain criterion used to drive experimental design in a linear-Gaussian model is submodular, so that a well-known approximation guarantee holds for the sequential greedy algorithm ...
The Ecole Polytechnique Fédérale de Lausanne is a science and technology university building its « Rolex Learning Center », which will be inaugurated in 2010. Point of entry to EPFL, the Rolex Learning Center (RLC) will be a place where people can learn, ...
We investigate the invariance of posterior features estimated using an MLP trained on an auxiliary corpus towards different data conditions and different distance measures for matching posterior features in the context of template-based ASR. Through ASR studies o ...
We consider lossy source compression of a binary symmetric source using polar codes and the low-complexity successive encoding algorithm. It was recently shown by Arikan that polar codes achieve the capacity of arbitrary symmetric binary-input discrete mem ...
Institute of Electrical and Electronics Engineers, 2010
Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...
The role of audio–visual speech synchrony for speaker diarisation is investigated on the multiparty meeting domain. We measured both mutual information and canonical correlation on different sets of audio and video features. As acoustic features we conside ...
A speaker diarization system based on an information theoretic framework is described. The problem is formulated according to the Information Bottleneck (IB) principle. Unlike other approaches where the distance between speaker segments is arbitrarily intr ...
This thesis focuses on the decision process of autonomous systems, and more particularly, on how to take a decision when the time available to assess the whole situation is shorter than necessary. Indeed, numerous systems propose solutions ...
In the meeting case scenario, audio is often recorded using Multiple Distant Microphones (MDM) in a non-intrusive manner. Typically, beamforming is performed in order to obtain a single enhanced signal out of the multiple channels. This paper investigate ...