Bayesian Networks for Error Handling through Multimodality Fusion in Spoken Dialogues with Mobile Robots

In this paper, we introduce a Bayesian network architecture for combining speech-based information with information from another modality for error handling in human-robot dialogue systems. In particular, we report on experiments interpreting speech and laser scanner signals in the dialogue management system of the autonomous tour-guide robot RoboX, successfully deployed at the Swiss National Exhibition (Expo.02). Correctly interpreting the user's (visitor's) goal or intention at each dialogue state, despite the uncertainty intrinsic to speech recognition, is a key issue for successful voice-enabled communication between tour-guide robots and visitors. Bayesian networks are used to infer the goal of the user in the presence of recognition errors, fusing speech recognition results with information about the acoustic conditions and data from a laser scanner, which is independent of acoustic noise. Experiments with real-world data, collected during the operation of RoboX at Expo.02, demonstrate the effectiveness of the approach in an adverse environment. The proposed architecture makes it possible to model error handling processes in spoken dialogue systems that involve complex combinations of multimodal information sources, in cases where such information is available.
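
The fusion idea can be illustrated with a minimal sketch in Python. It is not the network from the paper: the variable names, the three-valued goal ("yes", "no", "undefined"), the network structure, and every probability below are assumptions invented for illustration. The sketch assumes the recognized word depends on the user goal and the acoustic noise level, while the laser reading depends only on the goal (a person being present or not), and applies Bayes' rule by enumeration.

```python
# Toy Bayesian fusion of a speech result with a laser reading.
# All names and numbers are hypothetical, not taken from the paper.

GOALS = ("yes", "no", "undefined")

# Prior over the user goal, P(goal).
P_GOAL = {"yes": 0.3, "no": 0.3, "undefined": 0.4}

# P(laser detects a person | goal): unaffected by acoustic noise.
P_LASER = {"yes": 0.9, "no": 0.9, "undefined": 0.2}

# P(recognized word | goal, noise): recognition degrades under high noise.
P_SPEECH = {
    ("yes", "low"): {"yes": 0.85, "no": 0.05, "garbage": 0.10},
    ("no", "low"): {"yes": 0.05, "no": 0.85, "garbage": 0.10},
    ("undefined", "low"): {"yes": 0.10, "no": 0.10, "garbage": 0.80},
    ("yes", "high"): {"yes": 0.50, "no": 0.20, "garbage": 0.30},
    ("no", "high"): {"yes": 0.20, "no": 0.50, "garbage": 0.30},
    ("undefined", "high"): {"yes": 0.30, "no": 0.30, "garbage": 0.40},
}


def goal_posterior(recognized, noise, person_detected):
    """Return P(goal | recognized word, noise level, laser reading)."""
    scores = {}
    for goal in GOALS:
        # Likelihood of the laser observation under this goal.
        p_laser = P_LASER[goal] if person_detected else 1.0 - P_LASER[goal]
        # Joint score: prior times both observation likelihoods.
        scores[goal] = P_GOAL[goal] * p_laser * P_SPEECH[(goal, noise)][recognized]
    total = sum(scores.values())
    return {goal: score / total for goal, score in scores.items()}


if __name__ == "__main__":
    # Same noisy "yes" from the recognizer, with and without a person in range.
    print(goal_posterior("yes", "high", person_detected=True))
    print(goal_posterior("yes", "high", person_detected=False))
```

With these made-up tables, a noisy "yes" is accepted as the most probable goal only when the laser also reports a person; without that support, "undefined" becomes the most probable goal, which is the kind of case in which a dialogue manager would trigger an error handling strategy instead of acting on the recognition result.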
