LREC 2026 main

LuxBorrow: From Pompier to Pompjee, Tracing Borrowing in Luxembourgish

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/38pbv3g6swmm

Abstract

We present LuxBorrow, a borrowing-first analysis of Luxembourgish (LU) news spanning 27 years (1999–2025): 259,305 RTL articles and 43.7M tokens. Our pipeline combines sentence-level language identification (LU/DE/FR/EN) with a token-level borrowing resolver restricted to LU sentences, using lemmatization, a collected loanword registry, and compiled morphological/orthographic rules. Empirically, LU remains the matrix language across all documents, while multilingual practice is pervasive: 77.1% of articles include at least one donor language and 65.4% use three or four. Breadth does not imply intensity: median code-mixing index (CMI) increases from 3.90 (LU+1) to only 7.00 (LU+3), indicating localized insertions rather than balanced bilingual text. Domain/period summaries show moderate but persistent mixing, with CMI rising from 6.1 (1999–2007) to a peak of 8.4 (2020). Token-level adaptations total 25,444 instances and exhibit a mixed profile: morphological 63.8%, orthographic 35.9%, lexical 0.3%; the most frequent single rules are orthographic (on→oun, eur→er), while morphology is collectively dominant. Diachronically, code-switching intensifies, and morphologically adapted borrowings grow from a small base; French overwhelmingly supplies adapted items, with modest growth for German and negligible English. We advocate borrowing-centric evaluation (borrowed token/type rates, donor entropy over borrowed items, and assimilation ratios) over headline document-level mixing indices.
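The abstract names two kinds of measure without giving formulas: a document-level code-mixing index and a donor entropy over borrowed items. The sketch below shows one common way to compute each. It assumes the CMI definition of Das and Gambäck (2014) and plain Shannon entropy in bits. Neither is confirmed here as the paper's exact formulation. The function names, the `"other"` tag for language-neutral tokens, and the example tag lists are hypothetical.

```python
import math
from collections import Counter


def code_mixing_index(lang_tags, neutral="other"):
    """CMI as 100 * (1 - max_lang / (n - u)); assumed variant (Das and Gambäck, 2014).

    n is the number of tokens, u the number of language-neutral tokens,
    and max_lang the token count of the dominant (matrix) language.
    """
    n = len(lang_tags)
    counts = Counter(tag for tag in lang_tags if tag != neutral)
    u = n - sum(counts.values())
    if n == u:  # no language-tagged tokens, so nothing is mixed
        return 0.0
    return 100.0 * (1.0 - max(counts.values()) / (n - u))


def donor_entropy(donor_tags):
    """Shannon entropy (bits) of the donor-language distribution over borrowed items."""
    counts = Counter(donor_tags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical document: 18 LU tokens with 2 French insertions.
doc_tags = ["LU"] * 18 + ["FR"] * 2
cmi = code_mixing_index(doc_tags)

# Hypothetical donor labels for four borrowed tokens.
entropy = donor_entropy(["FR", "FR", "DE", "EN"])
```

Under this definition a document with a few isolated insertions scores low even when several donor languages appear, which matches the abstract's point that breadth does not imply intensity. Donor entropy, by contrast, is computed only over borrowed items, so it is unaffected by how much matrix-language text surrounds them.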

Details

Paper ID
lrec2026-main-249
Pages
pp. 3171-3183
BibKey
hosseinikivanani-etal-2026-luxborrow
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Nina Hosseini-Kivanani

  • Fred Philippy

Links