
A Learner-Oriented Annotated Resource of French Multiword Expressions for Text Adaptation in Foreign Language Reading

Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026

DOI:10.63317/47m6wxhoqpej

Abstract

This article presents a learner-oriented annotated lexical resource of French multiword expressions (MWEs) designed to support text adaptation in foreign language reading. MWEs, including idioms and collocations, pose major comprehension challenges for learners because their meaning often cannot be inferred compositionally or depends on conventional lexical constraints. To address this issue, the study extends the existing verbal MWE database by integrating nominal and verbal MWEs annotated according to a linguistically grounded typology distinguishing idioms, opaque collocations, and transparent collocations. The resource was developed through a multi-step methodology combining automatic extraction from pedagogical corpora, manual annotation using decision-tree-based guidelines, and CEFR level assignment based on corpus distribution. The resulting dataset includes approximately 2,700 expressions enriched with detailed linguistic and learner-relevant metadata. Annotation campaigns involving native and non-native annotators showed moderate agreement, reflecting the gradient nature of phraseological opacity. By linking phraseological complexity with learner proficiency, this resource provides a reproducible framework for modeling MWE difficulty. It offers valuable support for text adaptation, readability assessment, and the development of NLP-based educational tools, contributing to improved accessibility of French texts for language learners.

Details

Paper ID
lrec2026-ws-readixtsar-14
Pages
pp. 181-192
BibKey
kalinina-etal-2026-learner
Editors
Matthew Shardlow, Thomas François, Raquel Amaro, Jorge Baptista, Rémi Cardon, Eugénio Ribeiro, Horacio Saggion, Regina Stodden, Amalia Todirascu, Rodrigo Wilkens
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Anna Kalinina
  • Thomas François
  • Hélène Vassiliadou
  • Amalia Todirascu
