Visual behavior, and specifically gaze directed at objects and people, is a fundamental cue for understanding how children with Autism Spectrum Disorder (ASD) experience and respond to social interaction. Indeed, atypical behaviors such as averting the gaze from faces, looking out of the corner of the eyes, and having difficulty disengaging from non-social stimuli are well-known symptoms of this developmental disorder.

However, studying these atypicalities presents several technical challenges, at both the hardware and software levels. Traditionally, this type of analysis is done by viewing video recordings of the child's interaction and manually rating gaze behavior episodes. This data collection procedure is a time-consuming process, and all results have to be checked by multiple raters to ensure reliability. When automated systems are used, issues of intrusiveness, robustness, and reliability have to be taken into consideration, and their impact on the behavior of the child needs to be accounted for. Moreover, children often do not cooperate in the experimental process to the same extent as adult participants.

This thesis addresses the problem of studying the visual behavior of children during dyadic interactions, using tools from machine learning and computer vision. One focus is on the technical aspects of measuring the direction of the gaze and recognizing what the child is looking at. Another is on the development of semi-automated data collection and analysis methods that can be efficiently checked and corrected by the experimenter.

The concrete contribution of this work is a set of algorithms and software tools that allow, first, the measurement of multiple gaze factors and the quantification of attentional episodes directed at objects and people, and second, the analysis of their distributions, extrema, and spatio-temporal correlations. These tools are applied to the specific case of children affected by ASD, allowing an assessment of how their gaze strategies differ from those of typically developing children.

To measure the direction of the gaze of children in a robust and unobtrusive way, the main recording tool employed is the WearCam: a head-mounted camera developed in our laboratory that records the eyes of the child together with an image of the broad field of view. The reflected images of the eyes are mapped to gaze coordinates in the field of view by means of data-driven, appearance-based methods. This approach differs from standard gaze-tracking methods in that it makes no use of active lighting and allows the gathering of calibration data to be postponed until after the recording. This is especially useful as it relieves the children from actively participating in a calibration phase.

The task of identifying which object or person the child is looking at is approached through a semi-automated analysis of the recordings. A computationally intensive face and object detection system provides a preliminary estimation, which the experimenter can then efficiently review and correct.
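As a concrete illustration of the appearance-based gaze mapping described above, the following minimal Python sketch learns a regression from flattened eye-image crops to gaze coordinates in the field of view. The ridge regressor, the raw-pixel feature representation, and all function names are illustrative assumptions, not the exact method developed in the thesis.

    # Minimal sketch of a data-driven, appearance-based gaze mapping.
    # Assumes eye regions have already been cropped to fixed-size
    # grayscale images; regressor and features are illustrative only.
    import numpy as np
    from sklearn.linear_model import Ridge

    def train_gaze_mapper(eye_images, gaze_points):
        """Learn a mapping from eye appearance to gaze coordinates.

        eye_images:  (n_samples, h, w) grayscale crops of the eye region
        gaze_points: (n_samples, 2) target coordinates in field-of-view pixels
        """
        X = eye_images.reshape(len(eye_images), -1).astype(np.float64)
        X /= 255.0  # normalize pixel intensities
        model = Ridge(alpha=1.0)
        model.fit(X, gaze_points)  # Ridge supports 2-D (x, y) targets
        return model

    def estimate_gaze(model, eye_image):
        """Predict the (x, y) gaze location for a single eye crop."""
        x = eye_image.reshape(1, -1).astype(np.float64) / 255.0
        return model.predict(x)[0]

Because the mapping is learned purely from recorded eye appearance, calibration pairs can be collected and fitted after the session, consistent with the postponed calibration described above.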
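The preliminary estimation of what the child is looking at can be pictured as testing whether the estimated gaze point falls inside a bounding box produced by the face and object detector. This is a hypothetical sketch under assumed data structures; the box format and the detections list are illustrative, not the thesis's actual interface.

    # Hypothetical sketch: assign a gaze sample to a detected face or
    # object by testing point-in-box membership. Box format is assumed.
    def gaze_target(gaze_xy, detections):
        """Return the label of the detection containing the gaze point.

        gaze_xy:    (x, y) gaze estimate in field-of-view coordinates
        detections: list of (label, (x_min, y_min, x_max, y_max)) boxes
        """
        gx, gy = gaze_xy
        for label, (x0, y0, x1, y1) in detections:
            if x0 <= gx <= x1 and y0 <= gy <= y1:
                return label
        return None  # gaze falls on no detected face or object

In the semi-automated workflow, labels produced this way serve only as a first pass that the experimenter reviews and corrects.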
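Finally, a hedged sketch of how attentional episodes might be quantified from per-frame gaze targets: consecutive frames with the same label are grouped into episodes whose durations then feed the distribution, extrema, and correlation analyses mentioned above. The frame rate and the minimum episode length are assumed parameters, not values from the thesis.

    # Minimal sketch: group runs of identical per-frame gaze targets
    # into (label, duration) attentional episodes. fps and min_frames
    # are assumed parameters for illustration.
    from itertools import groupby

    def attention_episodes(frame_labels, fps=30.0, min_frames=3):
        """Group per-frame gaze targets into (label, duration_s) episodes.

        frame_labels: per-frame target labels (e.g. 'face', 'toy', None)
        min_frames:   runs shorter than this are discarded as noise
        """
        episodes = []
        for label, run in groupby(frame_labels):
            n = sum(1 for _ in run)
            if label is not None and n >= min_frames:
                episodes.append((label, n / fps))
        return episodes

    # Example: per-episode dwell times feed the distribution analyses.
    labels = ['face'] * 12 + [None] * 4 + ['toy'] * 20 + ['face'] * 2
    print(attention_episodes(labels))
    # -> [('face', 0.4), ('toy', 0.666...)]; the trailing 2-frame
    #    'face' run is dropped as shorter than min_frames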