CONVERSE: Annotation Scheme and Dataset for Multimodal Conversational Engagement Analysis in Human-Human and Human-Robot Interaction
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Creating conversational agents that can both understand and respond appropriately to users’ engagement remains a major challenge, as conversation is one of the most universal yet complex human behaviors. Modeling conversational engagement requires a fine-grained understanding of how engagement unfolds dynamically in interaction. This paper introduces a novel turn-based annotation scheme for conversational engagement, together with the CONVERSE dataset, which contains annotations of 25 hours of unscripted human–human and human–robot conversations with 48 native Swedish speakers. The dataset is unique in applying this annotation scheme to both human and robot agents within the same study, allowing for direct comparison. Notably, it builds upon our previous multimodal corpus, which includes brain imaging (fMRI), eye-tracking, and speech data, as well as personality and stance measures. By combining these behavioral annotations with the existing neural data, CONVERSE opens a new perspective on conversational engagement at the intersection of multimodal machine learning, human–robot interaction, and cognitive neuroscience.