
A Meta-evaluation of Automatic Metrics for Elaborative Simplification

Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026

DOI:10.63317/3bhnb2uoif7o

Abstract

Elaborative simplification aims to improve the readability of texts by adding content that helps readers. However, evaluating these elaborations remains challenging due to their subjective nature and the lack of suitable annotated datasets. To support the evaluation of elaborative simplification models, we introduce a new dataset with human ratings of elaborations generated by Large Language Models (LLMs), focusing on two quality criteria: cohesion and informativeness. Using these human judgments as a reference, we conduct a meta-evaluation of existing automatic evaluation approaches, with a focus on LLM-as-a-judge strategies. Our experiments suggest that evaluations made by smaller LLMs correlate poorly with human judgments, while larger models with structured prompting exhibit higher agreement. Evaluating informativeness proved challenging because of its subjectivity, as evidenced by its lower inter-annotator agreement compared to cohesion.

Details

Paper ID
lrec2026-ws-readixtsar-15
Pages
pp. 193-209
BibKey
alshatti-etal-2026-meta
Editors
Matthew Shardlow, Thomas François, Raquel Amaro, Jorge Baptista, Rémi Cardon, Eugénio Ribeiro, Horacio Saggion, Regina Stodden, Amalia Todirascu, Rodrigo Wilkens
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Abdullah Alshatti

  • Steven Schockaert

  • Fernando Alva-Manchego
