
MesoTree: Annotated Linguistic Resources for Quantitative Comparative Linguistic Analysis and NLP in Mesoamerica

Proceedings of the Ninth Workshop on Universal Dependencies (UDW 2026)

DOI:10.63317/2xvtti733shi

Abstract

One aspect of descriptive and documentary linguistic materials that is becoming increasingly important in the information age is that they be searchable, quantifiable, and comparable. In this paper, we describe an effort to create morphosyntactically annotated corpora for a number of under-served Mesoamerican languages using Universal Dependencies. We describe the Mesoamerican linguistic area and the languages involved in the project, outline the training and annotation process, and report on the current state of the corpora. Finally, we describe a comparative syntax experiment and train UD parsing models on the data, demonstrating the usefulness of UD for facilitating quantitative, comparative linguistic research.

Details

Paper ID
lrec2026-ws-udw-17
Pages
pp. 197-207
BibKey
pugh-etal-2026-mesotree
Editors
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Ninth Workshop on Universal Dependencies (UDW 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Robert Pugh

  • Francis Tyers

  • Robert Henderson

Links