This paper addresses the recognition of people’s visual focus of attention (VFOA), the discrete version of gaze that indicates who is looking at whom or what. In the absence of high-definition images, we rely on people’s head pose to recognize the VFOA. In contrast to most previous works, which assumed a fixed mapping between head pose directions and gaze target directions, we investigate novel gaze models documented in psychovision that produce a dynamic (temporal) mapping between them. This mapping accounts for two important factors affecting the head and gaze relationship: the shoulder orientation, which defines the gaze midline of a person, varies over time; and gaze shifts from frontal to the side involve different head rotations than the reverse. Evaluated on a public dataset and on data recorded with the humanoid robot Nao, the method exhibits better adaptivity, often producing better performance than the state-of-the-art approach.
Jean-Marc Odobez, Rémy Alain Siegfried