LREC 2026 workshop

Challenges in the Detection of Dialect for Historical Languages; the Case of Old Irish Text Resources

Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective

DOI:10.63317/2zf2k5jdm74v

Abstract

Old Irish presents particular challenges for the study of automatic dialect detection. It is generally accepted that Old Irish presents little trace of dialect. Extant Old Irish text resources introduce a considerable amount of extra variation, which could impact dialect identification applications. While some scholarship has suggested that certain features may be indicative of dialect, such hypotheses are difficult to substantiate where authorship is anonymous, or where the text itself is not associated with a particular geographical region. This paper describes the application of stylometric dialect detection techniques to Old Irish texts, and discusses the features which emerge from this process as potential markers of dialect. The aim is not necessarily to identify Old Irish dialectal features, but rather to investigate the impact that Old Irish text resources could have on such applications. This paper does, however, add to the extant body of research by highlighting some features which might be identified as stylistically distinct by stylometric dialect identification techniques.

Details

Paper ID
lrec2026-ws-dialres-04
Pages
pp. 33-47
BibKey
doyle-2026-challenges
Editors
Antonis Anastasopoulos, Stella Markantonatou, Angela Ralli, Marcos Zampieri, Stavros Bompolas, Vivian Stamou
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Adrian Doyle

Links