Evaluating the Retrieval Component in a Retrieval-Augmented Summarization System for Patient Records in French

Proceedings of the 8th Workshop on Clinical Natural Language Processing (Clinical NLP) @ LREC 2026

Abstract

In emergency and intensive care settings, clinicians must process large volumes of patient data to make time-sensitive decisions. Summarizing patient records can reduce cognitive load and improve decision-making, but the complexity and variability of clinical documentation make this difficult. We explore a Retrieval-Augmented Generation (RAG) approach consisting of two phases: (1) retrieval of relevant clinical information, and (2) generation of a summary. This paper evaluates the first phase, the retrieval component, on French clinical text. Using an annotated dataset from anonymized-hospital, we benchmark retrieval models tailored for French clinical records and propose an annotation-based querying method to improve accuracy and consistency in retrieving core clinical information. Compared with traditional prompt-based querying, the annotation-based method shows improved retrieval performance. These findings indicate that specialized retrieval techniques make RAG systems more effective in clinical settings by supplying more accurate and relevant information for summarization. The study thereby contributes to the development of clinical decision support tools, and the proposed methods offer a structured approach to summarizing patient records that may help clinicians manage information more efficiently.