Modeling and Inferring Attention between Humans or for Human-Robot Interactions

Rémy Alain Siegfried
2021
EPFL thesis

Abstract

More and more intelligent systems have to interact with humans. To communicate efficiently, these systems need to perceive and understand us. A key factor in communication is people's visual focus of attention (VFOA), which is useful for estimating addressees and engagement, among other things. It is also strongly related to gaze, its continuous counterpart, whose analysis makes it possible to estimate high-level features such as confidence and tiredness. Beyond communication, interesting statistics can be derived from eye movements themselves. For instance, fixation duration and blink rate have been shown to be related to mental health. Thus, the estimation of VFOA, gaze, and eye movements has great potential in a wide range of fields, such as human behavior analysis, human-computer interaction, and psychiatric diagnosis. However, despite recent improvements in sensors and methods, precisely tracking people's gaze and VFOA remains difficult without using intrusive sensors or constraining people's movements. Such constraints do not suit applications where users behaving naturally are recorded by remote sensors, as in many human-robot interactions or psychological studies.

This thesis introduces new approaches to improve gaze and VFOA estimation from videos recorded by remote sensors, which could, for example, be embedded on robots or be part of a room monitoring system. It proposes an unsupervised, online method to calibrate gaze trackers from attention priors used in conversation and manipulation, removing the need for a dedicated calibration session. It also proposes a method to estimate the VFOA in setups with an arbitrary and dynamic number of visual targets by using a fixed-size representation for all the subject- and context-related features. Finally, it proposes an eye movement recognition method that detects saccades and blinks in eye image sequences without the need for accurate gaze traces.

These approaches were validated through experiments on recordings of human conversation and object manipulation. Overall, by focusing on improving the usability of VFOA and gaze estimation, this thesis attempts to take a step toward the democratization of these methods and their application to weakly constrained setups relying on cheap sensing devices.

Official source

https://infoscience.epfl.ch/record/290060?ln=fr

SVGC-AVA: 360-Degree Video Saliency Prediction With Spherical Vector-Based Graph Convolution and Audio-Visual Attention

Investigating neural resource allocation in the sensorimotor control of extra limbs

A Biohybrid Superorganism - Investigating honeybees' collective behaviors via interactive robotics
