This paper presents a new database containing high-definition audio and video recordings in a rather unconstrained video-conferencing-like environment. The database consists of recordings of people sitting around a table in two separate rooms communicating and playing online games with each other. Extensive annotation of head positions, voice activity and word transcription has been performed on the dataset, making it especially useful for evaluating automatic speech-recognition, voice activity detection, speaker localisation, multi-face detection and tracking, and other audio-visual analysis algorithms.
Anastasia Ailamaki, Georgios Psaropoulos