Extracting Medication Instructions from Dutch General Practice Electronic Health Records with Local Natural Language Processing

Proceedings of the 8th Workshop on Clinical Natural Language Processing (Clinical NLP) @ LREC 2026

DOI:10.63317/3stt4uepqdnq

Abstract

The extraction of structured medication prescription data from unstructured clinical text remains a critical challenge for clinical research and data standardization. This study investigates the application of Natural Language Processing (NLP) techniques to Dutch electronic health records (EHRs) from the Julius General Practitioners Network. The goal is to automatically extract key prescription attributes, including dosage, duration, and medication unit, and to prepare them for integration into the ConcePTION Common Data Model to support scalable pharmacoepidemiological research. We compare a lightweight rule-based system with transformer-based models (RobBERT and MedRoBERTa) under the technical constraints of a Trusted Research Environment, where external resources and cloud-based solutions are restricted. Using a dataset of 1,819 manually annotated records, the approaches are evaluated on predictive performance and computational cost. Results show that the rule-based system achieves strong accuracy at low computational cost on structured patterns, while transformer-based models demonstrate greater robustness to linguistic variability. However, both approaches encounter difficulties with ambiguous dosage formats and long treatment durations. Our findings indicate that NLP methods can substantially improve the structuring of Dutch prescription data. Future work should focus on improving generalization and expanding annotated datasets to enhance model reliability.
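To illustrate what a lightweight rule-based extractor of this kind might look like, the following is a minimal sketch in Python. It is not the system described in the paper: the regular expressions, the `Instruction` fields, the example phrases, and the 30-day month are all assumptions made for illustration, and real general practice records would need far broader coverage.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns for illustration only; not the rules used in the paper.
# Frequency, e.g. "2x daags", "3 maal per dag"
FREQ_RE = re.compile(r"(\d+)\s*(?:x|maal|keer)\s*(?:per\s+dag|daags|dd)\b", re.IGNORECASE)
# Dose and unit, e.g. "1 tablet", "0,5 ml"
DOSE_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(tablet(?:ten)?|capsules?|ml|druppels?)\b", re.IGNORECASE)
# Duration, e.g. "gedurende 7 dagen", "gedurende 2 weken"
DUR_RE = re.compile(r"gedurende\s+(\d+)\s*(dag(?:en)?|we(?:ek|ken)|maand(?:en)?)\b", re.IGNORECASE)

# Map plural unit forms to a singular form (assumed normalization).
UNIT_SINGULAR = {"tabletten": "tablet", "capsules": "capsule", "druppels": "druppel"}


@dataclass
class Instruction:
    times_per_day: Optional[int]
    dose: Optional[float]
    unit: Optional[str]
    duration_days: Optional[int]


def _days(count: int, unit: str) -> int:
    """Convert a duration to days; a month is assumed to be 30 days."""
    unit = unit.lower()
    if unit.startswith("dag"):
        return count
    if unit.startswith("we"):
        return count * 7
    return count * 30


def extract(text: str) -> Instruction:
    """Extract frequency, dose, unit, and duration; missing fields stay None."""
    freq = FREQ_RE.search(text)
    dose = DOSE_RE.search(text)
    dur = DUR_RE.search(text)
    unit = dose.group(2).lower() if dose else None
    return Instruction(
        times_per_day=int(freq.group(1)) if freq else None,
        dose=float(dose.group(1).replace(",", ".")) if dose else None,
        unit=UNIT_SINGULAR.get(unit, unit) if unit else None,
        duration_days=_days(int(dur.group(1)), dur.group(2)) if dur else None,
    )


print(extract("2x daags 1 tablet gedurende 7 dagen"))
```

A sketch like this is cheap to run and easy to audit inside a restricted environment, but it only covers phrasings it has an explicit pattern for, which is consistent with the trade-off the abstract describes between rule-based and transformer-based approaches.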

Details

Paper ID
lrec2026-ws-clinicalnlp-19
Pages
pp. 174-182
BibKey
dukmak-etal-2026-extracting
Editors
Asma Ben Abacha, Steven Bethard, Danielle Bitterman, Tristan Naumann, Kirk Roberts
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 8th Workshop on Clinical Natural Language Processing (Clinical NLP) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Marya Dukmak

  • Constanza L. Andaur Navarro

  • Artuur Leeuwenberg
