More than "Oh": Grounding Observable Events with Grunts in Multimodal Dialogue

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/43opy9tsucf3

Abstract

Conversational grunts (minimal vocalizations like oh, mm-hm, and uh-huh) ground information and coordinate understanding in human dialogue, yet computational systems typically treat them as noise rather than meaningful communicative acts. We present a systematic annotation and analysis of 497 grunts across 3 hours of multimodal collaborative tasks, introducing an annotation scheme that captures grunts, their antecedents, and dialogue act functions. Our analysis reveals that grunts respond to speech and observable events at nearly equal rates, demonstrating that non-verbal events function as conversational contributions requiring acknowledgment. Tokens exhibit functional specialization: mm-hm predominantly acknowledges speech, while oh preferentially acknowledges events. Prosodic analysis shows speakers systematically modulate duration and pitch based on antecedent type, with event responses typically longer and having greater range. These findings have implications for dialogue state tracking, multimodal grounding, and turn-taking in conversational AI systems.

Details

Paper ID
lrec2026-main-132
Pages
pp. 1677-1687
BibKey
brutti-etal-2026-more
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Richard A. Brutti

  • James Pustejovsky