A Comparative Study of Multilingual Fine-tuning and Prompting for Automatic Text Readability Classification in Galician

Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026

DOI:10.63317/4tnwhe3r9579

Abstract

Despite advancements in automatic readability assessment, low-resource languages such as Galician remain under-explored. This study addresses this gap by comparing readability assessment techniques for Galician, including fine-tuning of encoder models as well as prompting strategies using large generative models. Due to the scarcity of native Galician resources, neural machine translation was employed to generate synthetic Galician data. The analysis begins with BERT-based monolingual models trained on the synthetic data. For multilingual models, the impact of using original versus translated data was compared in order to assess the effects of translation-based augmentation. Finally, several LLMs were evaluated using zero-shot and few-shot prompting methods. The results indicate that generative models are not yet competitive with encoder models tuned for text classification in Galician, and that data generated through machine translation improves the performance of monolingual models but has little effect on multilingual models.

Details

Paper ID
lrec2026-ws-readixtsar-08
Pages
pp. 101-120
BibKey
rodrguezrey-etal-2026-comparative
Editors
Matthew Shardlow, Thomas François, Raquel Amaro, Jorge Baptista, Rémi Cardon, Eugénio Ribeiro, Horacio Saggion, Regina Stodden, Amalia Todirascu, Rodrigo Wilkens
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Sandra Rodríguez Rey
  • Marcos Garcia
