Visual Focus of Attention Estimation in 3D Scene with an Arbitrary Number of Targets

Visual Focus of Attention (VFOA) estimation in conversation is challenging because it relies on information that is difficult to estimate (gaze), combined with scene features such as target positions and other contextual cues (speaking status) that help disambiguate situations. Previous VFOA models that fuse all these features are usually trained for a specific setup with a fixed number of interacting people, and must be retrained before they can be applied to a different setup, which limits their usability. To address these limitations, we propose a novel deep learning method that encodes all input features as a fixed number of 2D maps. This encoding makes the input more naturally processed by a convolutional neural network, provides scene normalization, and allows an arbitrary number of targets to be considered. Experiments performed on two publicly available datasets demonstrate that the proposed method can be trained in a cross-dataset fashion without loss in VFOA accuracy compared to intra-dataset training.
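
The idea of encoding the inputs as a fixed number of 2D maps can be illustrated with a minimal sketch. The abstract does not specify the actual encoding, so everything below is an assumption made for illustration: the three channels (gaze, target positions, speaking targets), the yaw/pitch angular grid, the Gaussian blobs, and the names `encode_scene` and `gaussian_blob` are all hypothetical. The point is only that the output shape does not depend on how many targets are in the scene.

```python
import numpy as np

def gaussian_blob(grid_yaw, grid_pitch, center, sigma):
    """Return a 2D Gaussian bump centred on `center` = (yaw, pitch) in degrees."""
    d2 = (grid_yaw - center[0]) ** 2 + (grid_pitch - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def encode_scene(gaze, targets, speaking, size=64, half_range=90.0, sigma=5.0):
    """Hypothetical encoding: rasterise a scene into 3 fixed-size maps.

    gaze     : (yaw, pitch) of the observed person's gaze, in degrees
    targets  : list of (yaw, pitch) target directions, any length
    speaking : list of booleans, one per target
    Returns an array of shape (3, size, size), whatever len(targets) is.
    """
    axis = np.linspace(-half_range, half_range, size)
    grid_yaw, grid_pitch = np.meshgrid(axis, axis)

    maps = np.zeros((3, size, size), dtype=np.float32)
    # Channel 0: where the person is estimated to be looking.
    maps[0] = gaussian_blob(grid_yaw, grid_pitch, gaze, sigma)
    for position, is_speaking in zip(targets, speaking):
        blob = gaussian_blob(grid_yaw, grid_pitch, position, sigma)
        # Channel 1: all candidate targets, merged into a single map.
        maps[1] = np.maximum(maps[1], blob)
        # Channel 2: only the targets that are currently speaking.
        if is_speaking:
            maps[2] = np.maximum(maps[2], blob)
    return maps

# Two scenes with different numbers of targets give inputs of the same shape.
two = encode_scene((10.0, 0.0), [(-30.0, 0.0), (30.0, 5.0)], [True, False])
five = encode_scene((10.0, 0.0), [(-60.0, 0.0), (-30.0, 0.0), (0.0, 0.0),
                                  (30.0, 5.0), (60.0, 0.0)], [False] * 5)
print(two.shape, five.shape)
```

Because every target is drawn into the same shared channels, a convolutional network with a fixed input size could consume either scene without architectural changes. How the proposed method actually builds and normalizes its maps is not stated in the abstract.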

Advancing Self-Supervised Deep Learning for 3D Scene Understanding

Predicting the long-term collective behaviour of fish pairs with deep learning

Data for Paper "Scalable Semantic 3D Mapping of Coral Reefs with Deep Learning"
