Med2Story Referential: A Domain-Specific Extension of ISO 24617-9 for Clinical Narratives Annotation
Proceedings of the 22nd Joint ACL-ISO Workshop on Interoperable Semantic Annotation and Representation (ISA-22) @ LREC 2026
Abstract
The semantic annotation of clinical narratives is particularly challenging because of the complexity of medical discourse and the need to integrate linguistic, semantic, and domain-specific information within a unified framework. Existing schemes tend to fall into two categories: general-purpose frameworks, which offer robust linguistic modelling but lack specialised medical representation, and domain-specific schemes, which capture clinical content yet often fail to distinguish fundamental semantic types, especially eventive expressions and referential entities. To address this gap, this study proposes Med2Story Referential, a new extension of the Text2Story annotation scheme (Silvano et al., 2021; Leal et al., 2022), itself based on ISO 24617-9:2019, dedicated to referential entities in clinical narratives. Building on previous work that introduced a specialised branch for eventive entities (Fernandes et al., 2025a), and informed by the UMLS Metathesaurus and expert validation from a consultant haematologist, the extension introduces eight referential categories that refine the representation of clinical actors, substances, biological entities, instruments, and documentation. The results show that ISO 24617-9:2019 can be applied to this type of text, although several adaptations are required, particularly with regard to the grammatical domain and the inclusion of specialised domain labels. With these adaptations, the annotation experiment conducted to validate our proposal showed that the annotation scheme and its accompanying guidelines enable a comprehensive and detailed representation of both grammatical and medical aspects. Moreover, the results indicate that the scheme can be applied effectively by annotators without medical expertise.