PE-HRI-temporal: A Multimodal Temporal Dataset in a robot mediated Collaborative Educational Setting

Please note that this dataset corresponds to the training data used in "Social robots as skilled ignorant peers for supporting learning" [7]. This second version of the dataset additionally includes labels (a PE score and a cluster label for each data point).

This dataset consists of multimodal temporal team behaviors and learning outcomes collected during a robot-mediated, collaborative and constructivist learning activity called JUSThink [1,2]. It can be useful for those who want to explore how log actions, speech behavior, affective states, and gaze patterns evolve over time, in order to model constructs such as engagement, motivation, and collaboration in educational settings.

In this dataset, team-level data is collected from 34 teams of two (68 children), where the children are aged between 9 and 12. There are two files:

PE-HRI_learning_and_performance.csv: This file contains the team-level performance and learning metrics, defined below:

last_error: This is the error of the last submitted solution. Note that if a team finds an optimal solution (error = 0), the game stops, so last_error = 0 in that case. This is a metric for performance in the task.

T_LG_absolute: A team-level learning outcome, calculated as the average of the two team members' individual absolute learning gains. The individual absolute gain is the difference between a participant’s post-test and pre-test scores, divided by the maximum achievable score (10). It captures how much the participant learned out of all the knowledge available.
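As a minimal sketch of the definition above, the absolute gain can be computed as follows. The function names and the example scores are hypothetical and chosen only for illustration; only the maximum score of 10 comes from the description.

```python
MAX_SCORE = 10  # maximum achievable test score, as stated in the description


def absolute_gain(pre, post, max_score=MAX_SCORE):
    """Individual absolute learning gain: (post - pre) / max_score."""
    return (post - pre) / max_score


def team_absolute_gain(member_a, member_b, max_score=MAX_SCORE):
    """Average of the two members' absolute gains.

    Each member is a (pre_test_score, post_test_score) pair.
    """
    gains = [absolute_gain(pre, post, max_score) for pre, post in (member_a, member_b)]
    return sum(gains) / len(gains)


# Hypothetical scores: member A goes from 4 to 7, member B from 6 to 8.
t_lg_absolute = team_absolute_gain((4, 7), (6, 8))
print(t_lg_absolute)
```

With these made-up scores the individual gains are 0.3 and 0.2, so the team-level value is their mean.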

T_LG_relative: A team-level learning outcome, calculated as the average of the two team members' individual relative learning gains. The individual relative gain is the difference between a participant’s post-test and pre-test scores, divided by the difference between the maximum achievable score and the pre-test score. It captures how much the participant learned of the knowledge that they did not already have before the activity.
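The relative gain can be sketched in the same way. The function names and example scores are hypothetical. The description does not say how a perfect pre-test score (which makes the denominator zero) is handled, so returning NaN in that case is an assumption of this sketch, not the dataset authors' rule.

```python
import math

MAX_SCORE = 10  # maximum achievable test score, as stated in the description


def relative_gain(pre, post, max_score=MAX_SCORE):
    """Individual relative learning gain: (post - pre) / (max_score - pre)."""
    if pre == max_score:
        # Assumption: the source does not define this case; NaN flags it.
        return math.nan
    return (post - pre) / (max_score - pre)


def team_relative_gain(member_a, member_b, max_score=MAX_SCORE):
    """Average of the two members' relative gains.

    Each member is a (pre_test_score, post_test_score) pair.
    """
    gains = [relative_gain(pre, post, max_score) for pre, post in (member_a, member_b)]
    return sum(gains) / len(gains)


# Hypothetical scores: member A goes from 4 to 7, member B from 6 to 8.
t_lg_relative = team_relative_gain((4, 7), (6, 8))
print(t_lg_relative)
```

Here member A gains 3 of the 6 points still available and member B gains 2 of 4, so both individual relative gains are 0.5 even though their absolute gains differ.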

T_LG_joint_abs: A team-level learning outcome, defined as the difference between the number of questions that both team members answer correctly in the post-test and the number they both answer correctly in the pre-test. It captures the amount of knowledge the team members acquired together during the activity.
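The joint gain depends on per-question correctness rather than total scores. The sketch below assumes each test is represented as a list of 0/1 flags, one per question, in the same question order for both members. This representation, the function names, and the example answers are hypothetical, and whether the published value is further normalized is not stated in the description.

```python
def both_correct(answers_a, answers_b):
    """Count the questions that both team members answered correctly."""
    return sum(1 for a, b in zip(answers_a, answers_b) if a and b)


def joint_absolute_gain(pre_a, pre_b, post_a, post_b):
    """Jointly correct questions in the post-test minus those in the pre-test."""
    return both_correct(post_a, post_b) - both_correct(pre_a, pre_b)


# Hypothetical correctness flags for a 10-question test (1 = correct).
pre_a = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
pre_b = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
post_a = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
post_b = [1, 1, 0, 0, 1, 1, 1, 1, 0, 0]

t_lg_joint_abs = joint_absolute_gain(pre_a, pre_b, post_a, post_b)
print(t_lg_joint_abs)
```

In this made-up example the members share 2 correct answers before the activity and 5 afterwards, so the joint gain is 3.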

PE-HRI_behavioral_timeseries_w_labels.csv: In this file, each team's interaction of around 20-25 minutes is divided into windows of 10 seconds, giving a total of 5048 windows. We report team-level log actions, speech behavior, affective states, and gaze patterns for each window. More specifically, within each window, 26 features are generated in two ways:

non-incremental

incremental

A non-incremental type would mean the value of a feature in that particular time window while an incremental type would mean the value of a feature until that particular time win