Temporal Expression Recognition in Legal Transcripts

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/5n7bd6gxobss

Abstract

Before working with clinical text data, it is essential to mask, remove or substitute any personal information in clinical reports. This information may include named entities, contact details and biographical information, all of which could lead to direct conclusions about an individual. However, there are scenarios in which clinical documentation cannot be anonymized, such as when it concerns a rare disease. Such records contain information like mentions of genetic peculiarities or the name of the treating physician. At first glance, this information does not appear to allow conclusions to be drawn about individuals, but it can. In this paper, we address the task of predicting whether a medical report (or a sentence therein) refers to a rare disease. Records of rare diseases may contain references to relatives and certain indications that can help reveal whether a rare disease is present. We design a pattern-based approach and a TF-IDF-based predictor, as well as two supervised learning experiments (one at document level and one at sentence level), achieving an F1-score of up to 98%. Our research is a first step towards a larger endeavor in which we aim to provide automated support for experts who document medical narratives of rare diseases.

Details

Paper ID
lrec2026-main-478
Pages
pp. 6022-6037
BibKey
goldstein-etal-2026-temporal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 - 16 May 2026

Authors

  • Elizabeth J. Goldstein

  • Maria Berger

Links