Towards Commensal Activities Recognition / Niewiadomski, R.; De Lucia, G.; Grazzi, G.; Mancini, M. - (2022), pp. 549-557. (Paper presented at the 24th ACM International Conference on Multimodal Interaction, ICMI 2022, held in Bangalore) [10.1145/3536221.3556566].
Towards Commensal Activities Recognition
Mancini M.
2022
Abstract
Eating meals together is one of the most frequent human social experiences. When eating in the company of others, we talk, joke, laugh, and celebrate. In this paper, we focus on commensal activities, i.e., the actions related to food consumption (e.g., chewing, food intake) and the social signals (e.g., smiling, speaking, gazing) that appear during shared meals. We analyze the social interactions in a commensal setting and provide a baseline model for automatically recognizing such commensal activities from video recordings. More specifically, starting from a video dataset containing pairs of individuals having a meal remotely via a video-conferencing tool, we manually annotate commensal activities. We also compute several metrics, such as the number of reciprocal smiles, mutual gazes, etc., to estimate the quality of social interactions in this dataset. Next, we extract the participants' facial activity information and use it to train standard classifiers (Support Vector Machines and Random Forests). Four activities are classified: chewing, speaking, food intake, and smiling. We apply our approach to more than 3 hours of video collected from 18 subjects. We conclude the paper by discussing possible applications of this research in the field of Human-Agent Interaction.
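The abstract describes training standard classifiers (Support Vector Machines and Random Forests) on extracted facial activity features to recognize the four commensal activities. The following is a minimal sketch of such a pipeline, not the authors' implementation: the feature and label files, their shapes, and the per-frame labeling scheme are assumptions introduced here for illustration only.

```python
# Minimal sketch (not the paper's code): training SVM and Random Forest
# classifiers to recognize commensal activities (chewing, speaking,
# food intake, smiling) from pre-extracted facial-activity features.
# The .npy files and their contents are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical data: one row of facial-activity features per frame,
# one activity label per frame (0=chewing, 1=speaking, 2=food intake, 3=smiling).
X = np.load("facial_features.npy")   # shape: (n_frames, n_features)
y = np.load("activity_labels.npy")   # shape: (n_frames,)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Train and evaluate both classifier families mentioned in the abstract.
for name, clf in [("SVM", SVC(kernel="rbf", C=1.0)),
                  ("Random Forest", RandomForestClassifier(n_estimators=100))]:
    clf.fit(X_train, y_train)
    print(name)
    print(classification_report(
        y_test, clf.predict(X_test),
        target_names=["chewing", "speaking", "food intake", "smiling"]))
```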