A Calibrated and Interpretable Framework for Multilingual Text Difficulty Prediction

Proceedings of the 2nd Workshop on Evaluating Text Difficulty in a Multilingual Context (DeTermIt! 2026)

Abstract

Text difficulty prediction in educational contexts requires models that balance predictive performance, interpretability, calibration, and pedagogical alignment. While transformer-based approaches increasingly dominate text difficulty classification, educational applications demand transparent and linguistically grounded modeling. This paper presents work toward a workbench for text difficulty prediction based on the Common European Framework of Reference for Languages (CEFR). The workbench comprises three main components: (i) a tool for CEFR-aligned dataset preparation, incorporating a pipeline for documenting, processing, and enriching textual data; (ii) CEFR-aligned datasets; and (iii) three alternative modeling approaches: a rule-based baseline, a feature-based machine learning (ML) classifier, and a fine-tuned BERT model. Our approach integrates linguistically informed feature engineering with data-driven modeling, thereby balancing transparency and predictive performance. The workbench is designed as language-agnostic infrastructure that can be extended to additional languages. In its current implementation, it has been used to create a German CEFR dataset; a Greek counterpart is under development.