Extending omnes flores for the EvaLatin 2026 Dependency Parsing Tasks

Proceedings of the Fourth Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2026) @ LREC 2026

DOI:10.63317/5nmhkkp3qheh

Abstract

omnes flores is an NLP framework based on Universal Dependencies (UD) that uses multilingual Large Language Models (LLMs); its default model is trained on 40 treebanks covering 40 UD languages. For the EvaLatin 2026 Dependency Parsing Tasks, we extended the training data of omnes flores with six public Latin treebanks from UD and trained a dependency parsing model on the extended data. The dependency parser of omnes flores normally takes a list of word FORM values as input. However, since the EvaLatin 2026 test data includes a UPOS column, we investigated whether using both FORM and UPOS at training and inference time could improve parsing accuracy. Our experiments show that, compared with training on FORM alone, training on both FORM and UPOS improves performance by 0.5-1.0 LAS points on Prose but decreases it by 5 points on Poetry.
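The two input settings compared in the abstract, and the LAS metric used to compare them, can be illustrated with a minimal sketch. It is not the omnes flores API: the function names `read_conllu_sentence`, `parser_input`, and `las` are hypothetical, the three-word sentence is a made-up example, and how the actual parser consumes UPOS is not stated in the abstract. Only the standard ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) is relied on.

```python
from typing import Dict, List

# Made-up CoNLL-U sentence for illustration (not taken from the EvaLatin data).
SAMPLE = (
    "# text = Puella rosam amat\n"
    "1\tPuella\tpuella\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "2\trosam\trosa\tNOUN\t_\t_\t3\tobj\t_\t_\n"
    "3\tamat\tamo\tVERB\t_\t_\t0\troot\t_\t_\n"
)


def read_conllu_sentence(block: str) -> List[Dict[str, str]]:
    """Parse one CoNLL-U sentence block into a list of token dicts."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        tokens.append(
            {"form": cols[1], "upos": cols[3], "head": cols[6], "deprel": cols[7]}
        )
    return tokens


def parser_input(tokens: List[Dict[str, str]], use_upos: bool) -> list:
    """Build the parser input: FORM only, or (FORM, UPOS) pairs."""
    if use_upos:
        return [(t["form"], t["upos"]) for t in tokens]
    return [t["form"] for t in tokens]


def las(gold: List[Dict[str, str]], pred: List[Dict[str, str]]) -> float:
    """Labeled attachment score: % of tokens with correct HEAD and DEPREL."""
    correct = sum(
        1
        for g, p in zip(gold, pred)
        if g["head"] == p["head"] and g["deprel"] == p["deprel"]
    )
    return 100.0 * correct / len(gold)


gold_tokens = read_conllu_sentence(SAMPLE)
form_only = parser_input(gold_tokens, use_upos=False)
form_and_upos = parser_input(gold_tokens, use_upos=True)
```

Here `form_only` corresponds to the default FORM-list input described above, while `form_and_upos` carries the extra UPOS column that the EvaLatin 2026 test data provides. A difference of one LAS point means one more token in a hundred receives both the correct head and the correct relation label.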

Details

Paper ID: lrec2026-ws-lt4hala-47
Pages: pp. 448-452
BibKey: matsuda-etal-2026-extending
Editors: Rachele Sprugnoli, Marco Passarotti
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of the Fourth Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2026) @ LREC 2026
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Hiroshi Matsuda

  • Masayuki Asahara
