ShAnEL-2: A Multilingual Benchmarking Dataset for Short-Answer Language Learning Exercises

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3cvfqh22muoo

Abstract

Before using GenAI models as EdTech tools, their pedagogical suitability should be corroborated. In this paper, we present ShAnEL-2, a novel multilingual dataset comprising 1,185 student responses to short-answer language learning exercises corrected by teachers. We use ShAnEL-2 to establish an initial benchmark of (1) "off-the-shelf" GenAI models and (2) retrieval-augmented generation (RAG) techniques for the automated correction of this exercise type. With an overall accuracy of 90% and recall of 95%, few-shot RAG (which adds previously corrected responses to the prompt) outperforms the off-the-shelf baseline and textbook RAG setup (which adds coursebook materials) by up to 7 (accuracy) and 5 (recall) percentage points. These results confirm that LLMs learn better from examples than from analysing context and highlight GenAI’s particular potential as a correction assistant for teachers.

Details

Paper ID
lrec2026-main-538
Pages
pp. 6764-6771
BibKey
degraeuwe-etal-2026-shanel
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 - 16 May 2026

Authors

  • Jasper Degraeuwe

  • Thomas Moerman

Links