We have all been, or have seen, the student who maintains a 'good student' image while playing a video game under the table, or the loyal backbencher who seems perpetually distracted and then aces the exam. Such intricacies of human behavior are just a few examples of why it is non-trivial, even for expert teachers, to know how students' visible behaviors relate to learning. As research investigates ways in which robots and AI can support teachers and students, it faces the same challenge of inferring students' engagement, which has made this topic increasingly popular in educational HRI. The state of the art usually explores the relationship between robot behaviors and the engagement state of the learner while assuming a linear relationship between engagement and learning. However, is it correct to assume that to maximize learning, one needs to maximize engagement? Furthermore, conventional supervised engagement models require human annotators to provide labels, which is not only laborious but can also introduce subjectivity. Can we build machine-learning engagement models whose labels do not rely on human annotators? Additionally, with the rise of open-ended learning activities that by design employ the 'learning by failing' paradigm, in-task performance may not be the best measure of learning. Can we instead rely on multi-modal behaviors? To address these challenges, this thesis sets out to identify and quantify the relationship between learning and engagement, which we term Productive Engagement (PE). In order to design, develop, and evaluate our PE framework, (1) we first designed and developed an open-ended collaborative learning activity that served as a platform for evaluating different robot variants over time. 
With 98 children from 2 international Swiss schools interacting with the baseline version, we showed that in-task performance and learning are indeed not correlated. This underlines the importance of not limiting robot interventions to those that affect only superficial measures of students' learning. (2) Then, using learners' multi-modal behaviors, we showed that there is indeed a hidden link between learners' behaviors and learning that can be quantified, thereby validating the proposed concept of Productive Engagement. (3) Using a forward and backward clustering and classification technique, this quantifiable link surfaced three collaborative multi-modal learner profiles, two of which are linked to higher learning. The technique makes it possible to surface data-driven labels for engagement, thus avoiding the need for human annotation. We then identified similarities and differences between these learner profiles at both the aggregate and the temporal level. (4) Based on (3), we constructed a PE score that can be used either directly as an assessment metric by a social robot in real time or as data-driven labels for building more sophisticated regression models. (5) With the learner profiles and the PE score, we designed and evaluated more advanced robot variants in the final studies with ~160 students from 7 international Swiss schools. By designing robot variants whose interventions draw on knowledge of the learner skills that are conducive to learning, rather than on domain knowledge, we provide a complementary perspective on the role of social robots in educational settings.
EPFL, 2022