Publication

Estimating Dominance in Multi-Party Meetings Using Speaker Diarization

Daniel Gatica-Perez, Yan Huang
2010
Journal paper

Abstract

With the increase in cheap, commercially available sensors, recording meetings is becoming an increasingly practical option. With this trend comes the need to summarize the recorded data in semantically meaningful ways. Here, we investigate the task of automatically measuring dominance in small group meetings when only a single audio source is available. Past research has found that speaking length, as a single feature, provides a very good estimate of dominance. For these tasks we use speaker segmentations generated by our automated, faster-than-real-time speaker diarization algorithm, where the number of speakers is not known beforehand. From user-annotated data, we analyze how the inherent variability of the annotations affects the performance of our dominance estimation method. We primarily focus on examining how the performance of the speaker diarization and of our dominance tasks varies under different experimental conditions and computationally efficient strategies, and how this would affect a practical implementation of such a system. Despite the use of a state-of-the-art speaker diarization algorithm, speaker segments can be noisy. In experiments on almost 5 hours of audio-visual meeting data, our results show that the dominance estimation is robust to increasing diarization noise.
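
As a rough illustration of the speaking-length feature mentioned in the abstract, the sketch below sums each speaker's total speaking time from diarization segments and picks the speaker with the largest total. It is not the authors' implementation: the segment format (speaker label, start time, end time in seconds), the function names, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def speaking_time(segments):
    """Sum speaking duration (in seconds) per speaker label.

    `segments` is an iterable of (speaker, start_sec, end_sec) tuples.
    This tuple layout is an assumption, not a format from the paper.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        if end < start:
            raise ValueError(f"segment ends before it starts: {speaker}, {start}, {end}")
        totals[speaker] += end - start
    return dict(totals)


def most_dominant(segments):
    """Return the speaker with the longest total speaking time, or None if empty."""
    totals = speaking_time(segments)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical diarization output for a short meeting excerpt.
segments = [
    ("spk0", 0.0, 12.5),
    ("spk1", 12.5, 15.0),
    ("spk0", 15.0, 30.0),
    ("spk2", 30.0, 34.0),
]

print(speaking_time(segments))
print(most_dominant(segments))
```

Because the estimate depends only on aggregate durations per speaker, small boundary errors in individual segments tend to shift the totals only slightly, which is one plausible reading of why such a feature could tolerate noisy diarization.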

Official source

https://infoscience.epfl.ch/record/150595?ln=en



Multisensory haptic system and method

SYSTEM FUSION AND SPEAKER LINKING FOR LONGITUDINAL DIARIZATION OF TV SHOWS

On dynamic stream weighting for Audio-Visual Speech Recognition
