We present a method that uses an information-theoretic framework to extract audio features that are optimal with respect to the video features. A simple measure of the mutual information between the resulting audio features and the video features makes it possible to detect the active speaker among several candidates. The results show that our method can exploit the speech information shared by the audio and video signals to recover their common source.
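The detection step described above can be illustrated with a minimal sketch. It does not reproduce the feature-extraction optimisation of the method; it only shows how a mutual-information score could be used to pick the active speaker once one-dimensional audio and video features are available. The histogram estimator, the function names `mutual_information` and `detect_active_speaker`, the bin count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of I(X;Y) in bits for two 1-D sequences.

    Assumption: a simple histogram estimator stands in for whatever
    estimator the original method uses.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y (row vector)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def detect_active_speaker(audio_feature, video_features, bins=8):
    """Return the index of the candidate whose video feature shares the
    most information with the audio feature, plus all scores."""
    scores = [mutual_information(audio_feature, v, bins) for v in video_features]
    return int(np.argmax(scores)), scores


# Synthetic illustration (hypothetical data, not results from the work):
rng = np.random.default_rng(0)
n = 2000
audio = rng.normal(size=n)
speaker = audio + 0.5 * rng.normal(size=n)   # video feature tied to the audio
silent = rng.normal(size=n)                  # video feature independent of it

active_index, scores = detect_active_speaker(audio, [silent, speaker])
```

In this toy setup the candidate whose video feature depends on the audio feature receives the higher score, so `active_index` points at it. With real signals the quality of the decision would depend on the extracted features and on the estimator, neither of which this sketch models.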
Touradj Ebrahimi, Rayan Daod Nathoo, Laurent Deillon, Henrique Piñeiro Monteagudo
Daniel Maria Busiello, Giorgio Nicoletti