Can LLMs Control Readability? A Multi-Dimensional Evaluation Framework for CEFR-Controlled Arabic Generation

Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026

Abstract

While Large Language Models (LLMs) can generate fluent Arabic text, their ability to reliably control readability levels remains unclear. We propose a multi-dimensional evaluation framework for Common European Framework of Reference for Languages (CEFR)-controlled Arabic text generation, assessing whether instruction-following LLMs can serve as reliable generators for adaptive language learning. Our framework integrates controlled prompting, automatic readability prediction using a validated Taha-19 model, lexical constraint validation, and syntactic complexity profiling. Results show that structured prompting substantially improves CEFR alignment. In particular, CEFR-guided prompting with lexical constraints achieves the highest conformity to reference linguistic profiles (0.91 cosine similarity) and near-perfect agreement with predicted readability levels (0.99), while unconstrained prompting exhibits weak control. These findings establish an empirical foundation for integrating readability-aware Arabic text generation into adaptive educational systems.

Details

Paper ID

lrec2026-ws-readixtsar-06

Pages

pp. 74-88

DOI

10.63317/48v7mxywgfja

BibKey

rabih-etal-2026-can

Editors

Matthew Shardlow, Thomas François, Raquel Amaro, Jorge Baptista, Rémi Cardon, Eugénio Ribeiro, Horacio Saggion, Regina Stodden, Amalia Todirascu, Rodrigo Wilkens

Publisher

European Language Resources Association (ELRA)

ISSN

N/A

ISBN

N/A

Workshop

Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026

Location

Palma, Mallorca, Spain

Date

11 - 16 May 2026

Authors

Nour Rabih
Chatrine Qwaider
Ted Briscoe
