HomeLREC 2026WorkshopsDIALRESlrec2026-ws-dialres-34
Back to DIALRES 2026
LREC 2026workshop

Transfer Learning for an Endangered Slavic Variety: Dependency Parsing in Pomak Across Contact-Shaped Dialects

Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective

DOI:10.63317/2mrmccuuhkgf

Abstract

This paper presents new resources and baselines for Dependency Parsing in Pomak, an endangered Eastern South Slavic language with substantial dialectal variation and no widely adopted standard. We focus on the variety spoken in Turkey (Uzunköprü) and ask how well a dependency parser trained on the existing Pomak Universal Dependencies treebank, which was built primarily from the variety that is spoken in Greece, transfers across dialects. We run two experimental phases. First, we train a parser on the Greek-variety UD data and evaluate zero-shot transfer to Turkish-variety Pomak, quantifying the impact of phonological and morphosyntactic differences. Second, we introduce a new manually annotated Turkish-variety Pomak corpus of 650 sentences and show that, despite its small size, targeted fine-tuning substantially improves accuracy; performance is further boosted by cross-variety transfer learning that combines the two dialects.

Details

Paper ID
lrec2026-ws-dialres-34
Pages
pp. 345-354
BibKey
karakas-2026-transfer
Editors
Antonis Anastasopoulos, Stella Markantonatou, Angela Ralli, Marcos Zampieri, Stavros Bompolas, Vivian Stamou
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • SK

    Sercan Karakas

Links