Missing Links: LLM-Augmentation of Event Triggers of State Changes in the OpenPI Dataset
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Effective computational understanding of procedural text requires modeling not only the state changes that occur (entity transformations) but also the specific actions that cause them (event triggers). The lack of datasets that explicitly link these two information sources has hindered progress in both theory-oriented research and NLP applications. This paper makes two contributions: (i) a new silver-standard dataset in which event trigger annotations are added to existing state-change data on task-oriented procedural text, enabling both theoretical investigation and practical benchmarking; and (ii) inverse annotation, a framework for recovering missing linguistic annotations from existing semantic annotations, which we apply to recover event triggers from OpenPI’s state-change outcomes. We provide a detailed analysis of the pipeline, including error modes and quality filtering, and validate the dataset through a comprehensive baseline evaluation of diverse trigger detection systems. Our work delivers both a reusable methodological framework applicable to other annotation recovery tasks and a new benchmark resource for modeling the relationship between linguistic actions and their semantic outcomes in procedural domains.