Improving Latvian Morphosyntactic Parsing with Pretrained Encoders and Analyzer-Constrained Decoding
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
We present a systematic evaluation of Latvian morphosyntactic parsing with pretrained transformer encoders in a unified joint architecture for tagging, lemmatization, and dependency parsing. We benchmark multilingual and Latvian-specific models and show that language-specific adaptation, even with modest in-language data, substantially improves performance. We further demonstrate that factored morphological modeling improves robustness and that integrating a Latvian morphological analyzer through constrained decoding yields consistent gains in XPOS tagging and lemmatization. The best system achieves new state-of-the-art results, reaching 95.22% XPOS accuracy, 98.72% lemma accuracy, and 93.19% LAS.
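The analyzer-constrained decoding mentioned above can be pictured as masking the tagger's scores so that only tags the morphological analyzer licenses for a word form remain selectable. The following is a minimal, self-contained sketch of that idea; the tagset entries, the `ANALYZER` lookup table, and the function name are illustrative assumptions, not details taken from the paper.

```python
# Sketch of analyzer-constrained decoding (illustrative, not the paper's code):
# the classifier's scores over XPOS tags are masked so that only tags licensed
# by a morphological analyzer for the given word form remain selectable.

import math

# Toy tagset and a toy "analyzer" lookup table (assumed for illustration).
TAGS = ["ncmsn1", "ncmsa1", "vmnipt330an", "afmsny"]

ANALYZER = {
    # word form -> set of XPOS tags the analyzer licenses
    "galds": {"ncmsn1"},   # 'table', masc. sg. nominative
    "galdu": {"ncmsa1"},   # 'table', masc. sg. accusative
}

def constrained_argmax(word, scores):
    """Pick the highest-scoring tag, restricted to analyzer-licensed tags.

    Falls back to the unconstrained argmax for forms the analyzer does not
    cover, so the constraint can only narrow choices, never block decoding.
    """
    allowed = ANALYZER.get(word)
    masked = [
        s if (allowed is None or t in allowed) else -math.inf
        for t, s in zip(TAGS, scores)
    ]
    return TAGS[max(range(len(TAGS)), key=masked.__getitem__)]

# The raw model prefers the accusative tag, but the analyzer licenses only
# the nominative for "galds", so constrained decoding corrects the choice.
raw_scores = [1.2, 2.5, 0.3, 0.1]
print(constrained_argmax("galds", raw_scores))  # -> ncmsn1
print(constrained_argmax("galdu", raw_scores))  # -> ncmsa1
```

In practice the same masking would be applied to the model's logits before the softmax or the CRF/decoding step; the fallback for uncovered forms is one plausible design choice for keeping the system robust to analyzer gaps.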