Publication

Computational Modeling of Face-to-Face Social Interaction Using Nonverbal Behavioral Cues

Dinesh Babu Jayagopi
2011
EPFL thesis
Abstract

The computational modeling of face-to-face interactions using nonverbal behavioral cues is an emerging and relevant problem in social computing. Studying face-to-face interactions in small groups helps in understanding the basic processes of individual and group behavior, and in improving team productivity and satisfaction in the modern workplace. Apart from the verbal channel, nonverbal behavioral cues form a rich communication channel through which people infer – often automatically and unconsciously – the emotions, relationships, and traits of fellow members. There exists a solid body of knowledge about small groups and the multimodal nature of the nonverbal phenomenon in social psychology and nonverbal communication. However, the problem has only recently begun to be studied in the multimodal processing community. A recent trend is to analyze these interactions in the context of face-to-face group conversations using multiple sensors, and to make inferences automatically without the need for a human expert. These problems can be formulated in a machine learning framework involving the extraction of relevant audio and video features and the design of supervised or unsupervised learning models. While attempting to bridge social psychology, perception, and machine learning, certain factors have to be considered. Firstly, various group conversation patterns emerge at different time scales. For example, turn-taking patterns evolve over shorter time scales, whereas dominance or group-interest trends get established over larger time scales. Secondly, a set of audio and visual cues that are not only relevant but also robustly computable needs to be chosen. Thirdly, unlike typical machine learning problems where ground truth is well defined, interaction modeling involves data annotation that needs to factor in inter-annotator variability. Finally, principled ways of integrating the multimodal cues have to be investigated.
In the first part of the thesis, we have investigated individual social constructs in small groups, such as dominance and status (two facets of the so-called vertical dimension of social relations). First, we have investigated how dominance perceived by external observers can be estimated by different nonverbal audio and video cues, and how it is affected by annotator variability, the estimation method, and the exact task involved. Second, we jointly study perceived dominance and role-based status to understand whether dominant people are the ones with high status, and whether dominance and status in small-group conversations can be automatically explained by the same nonverbal cues. We employ speaking activity, visual activity, and visual attention cues in both studies. In the second part of the thesis, we have investigated group social constructs using both supervised and unsupervised approaches. We first propose a novel framework to characterize groups. The two-layer framework consists of an individual layer and a group layer. At the individual layer, the floor-occupation patterns of the individuals are captured. At the group layer, the identity information of the individuals is not used. We define group cues by aggregating individual cues over time and person, and use them to classify group conversational contexts – cooperative vs competitive and brainstorming vs decision-making. We then propose a framework to discover group interaction patterns using probabilistic topic models. An objective evaluation of our methodology involving human judgment and multiple annotators showed that the learned topics are indeed meaningful, and also that the discovered patterns resemble prototypical leadership styles – autocratic, participative, and free-rein – proposed in social psychology.
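
The two-layer idea described above (individual floor-occupation cues, then identity-free group cues aggregated over time and person) can be illustrated with a small sketch. This is not the thesis's actual feature set: the binary speaking-status input, the function names `individual_cues` and `group_cues`, and the specific cues (speaking fraction, turn count, silence, overlap, evenness of speaking time) are all assumptions chosen for illustration.

```python
import math

# Hypothetical input format: one list of 0/1 speaking flags per person,
# one flag per time frame. All cue names below are illustrative assumptions.

def individual_cues(speaking):
    """Individual layer: per-person floor-occupation cues."""
    cues = []
    for flags in speaking:
        # A turn starts whenever a person switches from silent to speaking.
        turns = sum(
            1 for i, f in enumerate(flags) if f and (i == 0 or not flags[i - 1])
        )
        cues.append({
            "speaking_fraction": sum(flags) / len(flags),
            "turns": turns,
        })
    return cues

def group_cues(speaking):
    """Group layer: aggregate over time and person, ignoring identity."""
    n_frames = len(speaking[0])
    # Number of simultaneous speakers in each frame.
    active = [sum(frame) for frame in zip(*speaking)]
    silence = sum(1 for a in active if a == 0) / n_frames
    overlap = sum(1 for a in active if a >= 2) / n_frames

    # Share of total speaking time per person, sorted so that the result
    # does not depend on which person is listed first.
    totals = [sum(flags) for flags in speaking]
    grand_total = sum(totals)
    shares = sorted((t / grand_total for t in totals), reverse=True) if grand_total else []

    # Normalized entropy of the shares: 1.0 means perfectly even participation.
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    evenness = entropy / math.log(len(speaking)) if shares and len(speaking) > 1 else 0.0

    return {
        "silence_fraction": silence,
        "overlap_fraction": overlap,
        "speaking_evenness": evenness,
    }

# Toy three-person meeting over eight frames.
meeting = [
    [1, 1, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
]
print(individual_cues(meeting))
print(group_cues(meeting))
```

Because the group-layer values are computed from per-frame speaker counts and sorted speaking shares, reordering the participants leaves them unchanged, which mirrors the statement that identity information is not used at the group layer. Such a vector of group cues could then be fed to any standard classifier; which classifier the thesis uses is not stated here.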

Related concepts (45)
Nonverbal communication
Nonverbal communication (NVC) is the transmission of messages or signals through a nonverbal platform such as eye contact, facial expressions, gestures, posture, use of objects and body language. It includes the use of social cues, kinesics, distance (proxemics) and physical environments/appearance, of voice (paralanguage) and of touch (haptics). A signal has three different parts to it, including the basic signal, what the signal is trying to convey, and how it is interpreted.
Machine learning
Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.
Sensory cue
A sensory cue is a statistic or signal that can be extracted from the sensory input by a perceiver, that indicates the state of some property of the world that the perceiver is interested in perceiving. A cue is some organization of the data present in the signal which allows for meaningful extrapolation. For example, sensory cues include visual cues, auditory cues, haptic cues, olfactory cues and environmental cues. Sensory cues are a fundamental part of theories of perception, especially theories of appearance (how things look).
Related publications (52)

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation

Arnout Jan J Devos

Machine learning (ML) enables artificial intelligent (AI) agents to learn autonomously from data obtained from their environment to perform tasks. Modern ML systems have proven to be extremely effective, reaching or even exceeding human intelligence. Althou ...
EPFL, 2024

Robust machine learning for neuroscientific inference

Steffen Schneider

Modern neuroscience research is generating increasingly large datasets, from recording thousands of neurons over long timescales to behavioral recordings of animals spanning weeks, months, or even years. Despite a great variety in recording setups and expe ...
EPFL, 2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
EPFL, 2024
Related MOOCs (16)
Information, Calcul, Communication: Introduction à la pensée informatique
In the first part, we will first study how to solve a problem in a very concrete way by means of an algorithm, which will then lead us to one of the big questions ...
Neuronal Dynamics 2- Computational Neuroscience: Neuronal Dynamics of Cognition
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.
