In this paper, we propose a method based on Bayesian networks for the interpretation of multimodal signals used in spoken dialogue between a tour-guide robot and visitors under mass-exhibition conditions. We report on experiments interpreting speech and laser scanner signals in the dialogue management system of the autonomous tour-guide robot RoboX, successfully deployed at the Swiss National Exhibition (Expo.02). A correct interpretation of a user's (visitor's) goal or intention at each dialogue state is a key issue for successful voice-enabled communication between tour-guide robots and visitors. To infer the visitor's goal under the uncertainty intrinsic to these two modalities, we introduce Bayesian networks for combining noisy speech recognition with data from a laser scanner, which is independent of acoustic noise. Experiments with real data, collected during the operation of RoboX at Expo.02, demonstrate the effectiveness of the approach.
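The fusion principle described in the abstract can be illustrated with a minimal sketch: a two-parent Bayesian network where a hidden visitor goal generates both a (noisy) speech-recognition output and a laser-scanner reading, and the posterior over goals is obtained by combining the two likelihoods. All goal labels and probability tables below are hypothetical placeholders, not values from the paper.

```python
# Minimal sketch of Bayesian fusion of two noisy modalities to infer a
# visitor's goal. Goals, observations, and probabilities are illustrative
# assumptions, not taken from the RoboX system.

goals = ["follow_tour", "ask_question", "leave"]

# Assumed prior over visitor goals.
prior = {"follow_tour": 0.5, "ask_question": 0.3, "leave": 0.2}

# Assumed likelihood of the recognized speech token given each goal.
p_speech = {
    "follow_tour":  {"yes": 0.7, "no": 0.1, "noise": 0.2},
    "ask_question": {"yes": 0.3, "no": 0.2, "noise": 0.5},
    "leave":        {"yes": 0.1, "no": 0.7, "noise": 0.2},
}

# Assumed likelihood of the laser reading (visitor present/absent),
# which does not depend on acoustic noise.
p_laser = {
    "follow_tour":  {"present": 0.90, "absent": 0.10},
    "ask_question": {"present": 0.95, "absent": 0.05},
    "leave":        {"present": 0.20, "absent": 0.80},
}

def infer_goal(speech_obs, laser_obs):
    """Posterior over goals, assuming the two modalities are
    conditionally independent given the goal (naive Bayes fusion)."""
    joint = {g: prior[g] * p_speech[g][speech_obs] * p_laser[g][laser_obs]
             for g in goals}
    z = sum(joint.values())  # normalizing constant
    return {g: joint[g] / z for g in goals}

# Speech says "no" and the laser no longer detects the visitor:
posterior = infer_goal("no", "absent")
```

Even when the speech channel alone is ambiguous, the laser evidence shifts the posterior decisively; here the combined evidence makes "leave" the most probable goal.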