In this paper we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, it is possible to estimate the Time Delay of Arrival (TDOA) for different channels. In \cite{xavi_comb}, it is shown that TDOA can be used as additional features together with conventional spectral features for improving speaker diarization. We investigate here the combination of TDOA and spectral features in a fast diarization system based on the Information Bottleneck principle. We evaluate the algorithm on the NIST RT06 diarization task. Adding TDOA features to spectral features reduces the speaker error by 3% absolute. Results are comparable to those of conventional HMM/GMM based systems with consistent reduction in computational complexity.
Aude Billard, Mikhail Koptev, Nadia Barbara Figueroa Fernandez
Jean-Philippe Thiran, Tobias Kober, Bénédicte Marie Maréchal, Jonas Richiardi