Emergent leaders through looking and speaking: from audio-visual data to multimodal recognition
Communication between humans deeply relies on the capability of expressing and recognizing feelings. For this reason, research on human-machine interaction needs to focus on the recognition and simulation of emotional states, prerequisite of which is the c ...
Since the prosody of a spoken utterance carries information about its discourse function, salience, and speaker attitude, prosody models and prosody generation modules have played a crucial part in text-to-speech (TTS) synthesis systems from the beginni ...
The amount of visual feedback when using a mobile device in a busy context is often limited. For example, while texting and walking in a crowded place, we need to focus on the environment and not on the phone. A way to type fast, accurately and with limite ...
Understanding the principles involved in visually-based coordinated motor control is one of the most fundamental and most intriguing research problems across a number of areas, including psychology, neuroscience, computer vision and robotics. Not very much ...
The human brain analyzes a visual object first by basic feature detectors. These features are integrated in subsequent stages of the visual hierarchy. Generally it is assumed that the information about these basic features is lost once the information is s ...
As multimodal data becomes easier to record and store, the question arises as to what practical use can be made of archived corpora, and in particular what tools allowing efficient access to it can be built. We use the AMI Meeting Corpus as a case study to ...
We present a novel, biologically inspired approach to the efficient allocation of visual resources for humanoid robots in the form of a motor-primed visual attentional landscape. The attentional landscape is a more general, dynamic, and complex concept ...
Vision is dynamic. After their onset, visual stimuli undergo a complex cascade of processes before awareness is reached. Even after more than a century of research, the timing of these processes is still largely unknown. In particular, how the brain determ ...
In this article, we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on overcoming the effects of a changing pose of the speaker, a problem encountered in natural situations where the speak ...
A non-obtrusive portable device, wearable from infancy through adulthood, mounted with i) a set of two or more optical devices providing visual and audio information as perceived by the user, ii) an actuated mirror or optical device returning visual infor ...