IREKIER: An Easy Read Corpus for Basque and Spanish

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2e96m595cmfg

Abstract

Easy Read (ER) text adaptation is one of the main means of providing accessible content for people with reading difficulties. ER text features aspects of text simplification, along with specific characteristics such as the need for short sentences, clearly structured content, and explanations for complex concepts. Support for ER text generation is still lacking overall, with few available resources on which to build automated systems. In this work, we describe the IREKIER corpus, based on ER news in Basque and Spanish from the Irekia transparency portal of the Basque Government. This corpus is currently one of the largest publicly shared resources to support training and evaluation of ER text adaptation models in these two languages, and the first of its kind for Basque. We describe our methodology to create the resource, along with the specific challenges raised by ER text. We also provide both intrinsic and extrinsic evaluations of the corpus, which is shared with the scientific community under a CC-BY-NC-ND 4.0 license.

Details

Paper ID
lrec2026-main-128
Pages
pp. 1633-1649
BibKey
calleja-etal-2026-irekier
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 - 16 May 2026

Authors

  • Jesús Calleja

  • Thierry Etchegoyhen

Links