We address the problem of recognizing the visual focus of attention (VFOA) of meeting participants based on their head pose. To this end, the head pose observations are modeled using a Gaussian Mixture Model (GMM) or a Hidden Markov Model (HMM) whose hidden states correspond to the VFOA. The novelties of this work are threefold. First, contrary to previous studies on the topic, the potential VFOA of a person in our set-up is not restricted to other participants. It also includes environmental targets (a table and a projection screen), which increases the complexity of the task because the VFOA targets are more numerous and are spread across both the pan and tilt dimensions of the gaze space. Second, we propose a geometric model to set the GMM or HMM parameters by exploiting results from cognitive science on saccadic eye motion, which allows the prediction of the head pose given a gaze target. Third, we propose an unsupervised parameter adaptation step, which uses no labeled data and accounts for the specific gazing behaviour of each participant.
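To make the recognition step concrete, the following is a minimal sketch of a GMM-style classifier in which each VFOA target is one Gaussian over head pose (pan, tilt), and each Gaussian mean is set by a simple geometric rule: the head is assumed to perform a fixed fraction of the rotation needed to gaze at the target. The target directions, the fractions in `KAPPA`, the standard deviations in `SIGMA`, the equal priors, and all function names are illustrative assumptions, not values or identifiers from this work. The sketch omits the HMM temporal model and the unsupervised adaptation step.

```python
import numpy as np

# Hypothetical gaze directions (pan, tilt) in degrees toward each VFOA target,
# as seen from one participant's seat. Illustrative values only.
TARGETS = {
    "person_left": (-45.0, 0.0),
    "person_right": (40.0, 0.0),
    "table": (0.0, -40.0),
    "screen": (70.0, 10.0),
}

# Assumed fraction of the gaze rotation performed by the head (pan, tilt).
KAPPA = np.array([0.6, 0.4])
# Assumed per-axis standard deviation of head pose around each mean, in degrees.
SIGMA = np.array([12.0, 10.0])


def predicted_head_pose(gaze, reference=(0.0, 0.0)):
    """Predict head pose for a gaze target as a partial rotation from a reference."""
    gaze = np.asarray(gaze, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return reference + KAPPA * (gaze - reference)


def log_gaussian(x, mean, sigma):
    """Log-density of an axis-aligned Gaussian."""
    z = (x - mean) / sigma
    return -0.5 * np.sum(z ** 2) - np.sum(np.log(sigma)) - 0.5 * len(x) * np.log(2 * np.pi)


def recognize_vfoa(head_pose):
    """Return the target whose Gaussian best explains the pose (equal priors assumed)."""
    head_pose = np.asarray(head_pose, dtype=float)
    scores = {
        name: log_gaussian(head_pose, predicted_head_pose(gaze), SIGMA)
        for name, gaze in TARGETS.items()
    }
    return max(scores, key=scores.get)
```

For example, `recognize_vfoa((0.0, -16.0))` selects the table, because the assumed geometric rule places the table's mean head pose at 40% of its 40-degree downward gaze tilt. An HMM variant would reuse the same Gaussians as emission densities and add transition probabilities between VFOA states.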
Michael Herzog, Simona Adele Garobbio, Jean-Marc Odobez, Rémy Alain Siegfried