This work investigates the automatic recognition of speaker roles in meeting conversations from the AMI corpus. Two types of roles are considered: formal roles, which are fixed over the duration of the meeting and recognized at the recording level, and social roles, which relate to the way participants interact with one another and are recognized at the speaker-turn level. Various structural, lexical and prosodic features, as well as Dialog Act tags, are exhaustively investigated and combined for this purpose. Results show an accuracy of 74% in recognizing the speakers' formal roles and an accuracy of 66% (percentage of time) in correctly labeling the social roles. Feature analysis reveals that lexical features provide the highest performance in formal/functional role recognition, while prosodic features provide the highest performance in social role recognition. Furthermore, the results show that recognition of social roles that are rare in the corpus can be improved by using lexical and Dialog Act information combined over short time windows.
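The social role accuracy above is reported as a percentage of time rather than a percentage of turns. The abstract does not spell out the computation, so the following is only a minimal sketch of one plausible reading: each speaker turn contributes its duration, and accuracy is the fraction of total speech time whose predicted role matches the reference. The function name `time_weighted_accuracy`, the turn layout, the role labels and the toy values are all hypothetical and are not taken from the AMI corpus or from the paper.

```python
from typing import List, Tuple

# Hypothetical turn layout: (duration in seconds, reference role, predicted role).
Turn = Tuple[float, str, str]


def time_weighted_accuracy(turns: List[Turn]) -> float:
    """Fraction of total turn time whose predicted role matches the reference."""
    total = sum(duration for duration, _, _ in turns)
    if total == 0:
        raise ValueError("no speech time to score")
    correct = sum(duration for duration, ref, hyp in turns if ref == hyp)
    return correct / total


# Toy data for illustration only; the labels are placeholders, not AMI roles.
toy_turns = [
    (4.0, "role_a", "role_a"),
    (2.0, "role_b", "role_a"),  # the only mislabeled turn
    (6.0, "role_a", "role_a"),
    (8.0, "role_b", "role_b"),
]

accuracy = time_weighted_accuracy(toy_turns)
print(f"{accuracy:.2f}")  # prints 0.90 (18 of 20 seconds correct)
```

Under this reading, long turns weigh more than short ones, so a system that mislabels many brief turns can still score well, whereas a per-turn accuracy would penalize each of them equally.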
Hervé Bourlard, Volkan Cevher, Afsaneh Asaei, Mohammadjavad Taghizadeh, Saeid Haghighatshoar