A Dataset of Psychiatric Hospital Notes with Temporal Information Annotations
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Temporal information extraction is the task of identifying temporal entities in a text and relating them to each other. In medicine, electronic health records (EHRs) contain text that documents the sequence of events during an encounter with a patient, and sometimes events prior to the encounter (e.g., social history). Temporality is especially important in psychiatry. In this work, we describe the updates to the annotation guidelines that allowed us to create a corpus of temporally annotated psychiatric discharge summaries and progress notes. The resulting corpus contains over 18,000 events, 2,200 time expressions, and 13,000 temporal relations. A baseline system trained on non-psychiatric data obtains an F1 score of 0.152 on relation extraction, indicating the importance of this new dataset for making progress on temporal information extraction in the psychiatric domain.