LREC 2026 main

PHEB: An European Portuguese High School-Level LLM Benchmark

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2o3fvueefvwj

Abstract

We present PHEB, a comprehensive benchmark designed to evaluate Large Language Models (LLMs) on real high school level national exams in European Portuguese. The goal is to promote the development of NLP tools and provide a reliable resource for benchmarking multilingual and educational capabilities of LLMs. Covering over 3,500 questions spanning 18 years (2006–2023) across six core subjects, the benchmark compiles high-quality questions from Portuguese National Exams, written and thoroughly curated by professors to ensure topic diversity, linguistic accuracy, and alignment with national curricula. PHEB spans a wide range of subjects, including Mathematics, Portuguese Language and Literature, History, Geography, Biology/Geology, and Philosophy. Questions incorporate both multiple-choice and long-form answers to assess factual knowledge, reasoning capabilities, and language understanding. We comprehensively benchmark state-of-the-art LLMs, shedding light on key challenges such as models’ knowledge, language coverage, answer format biases and robustness to machine translation.

Details

Paper ID
lrec2026-main-367
Pages
pp. 4673-4683
BibKey
tavares-etal-2026-pheb
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 – 16 May 2026

Authors

  • Diogo C. Tavares
  • Rafael Ferreira
  • Afonso Simplício
  • Gonçalo Vinagre
  • Ana Carolina Condez
  • Inês Calvo
  • Inês Vieira
  • David Semedo
  • Joao Magalhaes

Links