In this paper, we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, the Time Delay of Arrival (TDOA) can be estimated between channels. In \cite{xavi_comb}, it is shown that TDOA values can be used as additional features alongside conventional spectral features to improve speaker diarization. Here, we investigate the combination of TDOA and spectral features in a fast diarization system based on the Information Bottleneck principle. We evaluate the algorithm on the NIST RT06 diarization task. Adding TDOA features to the spectral features reduces the speaker error by 3% absolute. The results are comparable to those of conventional HMM/GMM-based systems, with a consistent reduction in computational complexity.
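The abstract does not say how the TDOA values are estimated. As a rough illustration only, the sketch below uses GCC-PHAT (generalized cross-correlation with phase transform), a common choice for microphone pairs; it is an assumption here, not the method confirmed by the source. The function name `gcc_phat_tdoa` and its parameters are hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Assumption: GCC-PHAT is used for illustration; the source does not
    specify the TDOA estimator.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    sig_spec = np.fft.rfft(sig, n=n)
    ref_spec = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, whitened so that only phase information remains.
    cross = sig_spec * np.conj(ref_spec)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs


if __name__ == "__main__":
    # Synthetic check: delay a noise signal by a known number of samples.
    rng = np.random.default_rng(0)
    fs = 16000
    ref = rng.standard_normal(4096)
    delay = 12
    sig = np.concatenate((np.zeros(delay), ref[:-delay]))
    tau = gcc_phat_tdoa(sig, ref, fs, max_tau=0.005)
    print(f"estimated delay: {tau * fs:.1f} samples")
```

In a multi-microphone setup, one such delay per channel pair (against a reference channel) could be stacked into a TDOA feature vector per frame and treated as a second feature stream next to the spectral features.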