LREC 2026 main

Variation Is the Norm: Embracing Sociolinguistics in NLP

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3v3vhacxgfex

Abstract

In Natural Language Processing (NLP), variation is typically seen as noise and “normalised away” before processing, even though it is an integral part of language. Conversely, studying language variation in social contexts is central to sociolinguistics. We present a framework to combine the sociolinguistic dimension of language with the technical dimension of NLP. We argue that by embracing sociolinguistics, variation can actively be included in a research setup, in turn informing the NLP side. To illustrate this, we provide a case study on Luxembourgish, an evolving language featuring a large amount of orthographic variation, demonstrating how NLP performance is impacted. The results show large discrepancies in the performance of models tested and fine-tuned on data with a large amount of orthographic variation in comparison to data closer to the (orthographic) standard. Furthermore, we provide a possible solution to improve the performance by including variation in the fine-tuning process. This case study highlights the importance of including variation in the research setup, as models are currently not robust to occurring variation. Our framework facilitates the inclusion of variation in the thought process while also being grounded in the theoretical framework of sociolinguistics.

Details

Paper ID
lrec2026-main-886
Pages
pp. 11332-11344
BibKey
lutgen-etal-2026-variation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Anne-Marie Lutgen
  • Alistair Plum
  • Verena Blaschke
  • Barbara Plank
  • Christoph Purschke

Links