
Benchmarking Retrieval-Augmented Generation for Scientific Knowledge QA in European Portuguese

Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026

DOI:10.63317/3muergicuxwk

Abstract

Retrieval-Augmented Generation (RAG) enables grounding of model outputs in external evidence, but its impact on European Portuguese (pt-PT) scientific question answering (QA) remains unclear. We present a controlled evaluation of RAG on pt-PT knowledge QA across different scientific domains using the Portuguese test split of the Global MMLU Lite dataset. As external evidence, we use a Portuguese scientific literature knowledge base containing over 32,000 documents converted to Markdown. We benchmark five instruction-tuned small language models (4-12B parameters) and compare closed-book baselines against 16 RAG configurations that vary by: (i) dense retriever specialization (multilingual vs. Portuguese-specific), (ii) reranking (on/off), and (iii) number of retrieved chunks (k ∈ {1, 3, 5, 10}). Results suggest that RAG gains are model-dependent: some models improve consistently, others are highly sensitive to retrieval choices, and some degrade under retrieval noise, especially at larger values of k. Findings highlight the importance of model-specific retrieval tuning and of ensuring that the retriever and reranker languages and domains align when deploying RAG systems for Portuguese natural scientific language processing.
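
The count of 16 RAG configurations follows from crossing the three factors named in the abstract: 2 retriever types × 2 reranking settings × 4 values of k. A minimal sketch of that grid is given below. The labels (`multilingual`, `portuguese_specific`) and the function name `rag_configurations` are hypothetical placeholders chosen for illustration; the abstract does not name the actual retriever or reranker models, and this is not the authors' code.

```python
from itertools import product

# Hypothetical labels: the abstract names the factors, not the identifiers.
RETRIEVERS = ("multilingual", "portuguese_specific")  # dense retriever specialization
RERANKING = (False, True)                             # reranking off / on
TOP_K = (1, 3, 5, 10)                                 # number of retrieved chunks


def rag_configurations():
    """Return the full factorial grid of RAG settings as a list of dicts."""
    return [
        {"retriever": retriever, "rerank": rerank, "k": k}
        for retriever, rerank, k in product(RETRIEVERS, RERANKING, TOP_K)
    ]


if __name__ == "__main__":
    configs = rag_configurations()
    print(len(configs))  # 2 * 2 * 4 configurations
    for config in configs:
        print(config)
```

Each model would then be scored once closed-book and once per grid entry, which is what makes the comparison controlled: only one factor changes between neighbouring configurations.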

Details

Paper ID
lrec2026-ws-nslp-03
Pages
pp. 25-31
BibKey
matos-etal-2026-benchmarking
Editors
Georg Rehm, Stefan Dietze, Danilo Dessi, Diana Maynard, Sonja Schimmler
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Jose Matos

  • Catarina Silva

  • Hugo Goncalo Oliveira

Links