Unsupervised Labelling of Mutation Triggers in Welsh

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/37oxwc9pnyfv

Abstract

Initial consonant mutation is a key feature of Welsh, but its complexity poses significant challenges for both language learners and natural language processing (NLP) systems. While existing tools can reliably detect mutated forms, they provide no information about why a mutation occurs, i.e. what grammatical or lexical factors trigger the change. This paper introduces the novel task of mutation trigger labelling, representing the first computational attempt to analyse and explain the reasons behind Welsh mutations. Two preliminary approaches are explored: (i) a linguistically-informed rule-based system integrating Constraint Grammar rules, and (ii) large language models (LLMs), prompted in few-shot settings. Our experiments test the feasibility of automatically identifying and labelling linguistic triggers behind Welsh mutations using a dataset constructed from grammar reference books and public corpora, and establish baseline insights into how context-aware mutation analysis can be achieved. By framing mutation trigger labelling as a linguistic computational problem, this work lays important groundwork within Welsh NLP and contributes to the broader development of explainable grammatical analysis for low-resource languages.

Details

Paper ID
lrec2026-main-911
Pages
pp. 11631-11641
BibKey
gutirrezroln-etal-2026-unsupervised
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Nicolás Gutiérrez-Rolón

  • Fernando Alva-Manchego

Links