Traditional speech recognition systems use Gaussian mixture models to obtain the likelihoods of individual phonemes, which are then used as state emission probabilities in hidden Markov models representing the words. In hybrid systems, the Gaussian mixture ...
We present a system that exploits advanced Virtual Reality technologies to create a surveillance and security system. Surveillance cameras are carried by a mini Blimp which is tele-operated using an innovative Virtual Reality interface with haptic feedback ...
Accurate detection, localization and tracking of multiple moving speakers permits a wide spectrum of applications. Techniques are required that are versatile, robust to environmental variations, and not constraining for non-technical end-users. Based on di ...
Estimating the wandering visual focus of attention (WVFOA) for multiple people is an important problem with many applications in human behavior understanding. One such application, addressed in this paper, monitors the attention of passers-by to outd ...
Real world applications such as hands-free speech recognition of isolated digits may have to deal with potentially very noisy environments. Existing state-of-the-art solutions to this problem use feature-based HMMs, with a preprocessing stage to clean the ...
In this paper, we introduce a probabilistic model-based architecture for error handling in human–robot spoken dialogue systems under adverse audio conditions. In this architecture, a Bayesian network framework is used for interpretation of multi-modal signal ...
In this paper, we present a framework for predicting and correcting classification decision errors based on modality reliability measures in a multimodal biometric system. In our experiments we use face and speech experts based on a recently proposed frame ...
This paper presents a biologically inspired approach to multimodal integration and decision-making in the context of human-robot interactions. More specifically, we address the principle of ideomotor compatibility by which observing the movements of others ...
This paper proposes an application of an information-theoretic approach for finding the most informative subset of eigenfeatures to be used for audio-visual speech recognition tasks. The state-of-the-art visual feature extraction methods in the area of speech ...