Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...
This paper proposes an application of an information-theoretic approach to finding the most informative subset of eigenfeatures for audio-visual speech recognition tasks. The state-of-the-art visual feature extraction methods in the area of speech ...
In this paper we consider the problem of automatically extracting geometric lip features for multi-modal speaker identification. The use of visual information from the mouth region can be of great importance for improving the speaker ide ...
In this paper we aim to determine the most appropriate number of data samples needed when measuring the temporal correspondence between a chosen set of video and audio cues in a given audio-visual sequence. Presently the optimal model that connects s ...