Cross-Dataset Inconsistencies in Morphological Annotation: Evidence from Universal Dependencies
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Ensuring annotation consistency is a challenging task in language dataset development. While the difficulty typically increases at higher levels of linguistic complexity, we show that consistency is a critical issue even for fundamental linguistic tasks such as morphological annotation. In contrast to previous research, which targeted intra-dataset inconsistencies, this study investigates inconsistencies across various pre-existing datasets for the same language. Using Universal Dependencies datasets as a case study, we examine which morphological categories exhibit the most disagreement. The analysis reveals specific categories with low inconsistency scores, indicating good agreement on these features (namely Case, Gender, Number, and, to a lesser extent, Animacy). The part-of-speech (UPOS) tag, on the other hand, stands out as a "red flag" due to its high inconsistency score. Analysis of the most frequent inconsistencies suggests that they are dataset-specific artifacts rather than inherently language-specific phenomena.
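The abstract refers to a per-feature inconsistency score without defining it. Below is a minimal sketch of one way such a score could be computed from CoNLL-U files; the scoring criterion (share of tokens deviating from the majority value pooled by lowercased word form), the feature list, and all function names are illustrative assumptions, not the paper's published procedure.

```python
import sys
from collections import Counter, defaultdict

# Assumed scoring method (not from the paper): for each (word form, feature)
# pair, pool the values assigned across all input treebanks; a feature's
# inconsistency is the fraction of tokens whose value differs from the
# majority value observed for that form.
FEATURES = ("UPOS", "Case", "Gender", "Number", "Animacy")

def read_conllu(path):
    """Yield (lowercased form, annotations) pairs from a CoNLL-U file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multiword-token ranges (e.g. "1-2") and empty nodes ("1.1").
            if "-" in cols[0] or "." in cols[0]:
                continue
            annot = {"UPOS": cols[3]}
            if cols[5] != "_":  # FEATS column: pipe-separated Feature=Value
                annot.update(kv.split("=", 1) for kv in cols[5].split("|"))
            yield cols[1].lower(), annot

def inconsistency_scores(paths):
    """Per-feature disagreement rate pooled over all given treebanks."""
    # feature -> form -> counter of observed values
    values = defaultdict(lambda: defaultdict(Counter))
    for path in paths:
        for form, annot in read_conllu(path):
            for feat in FEATURES:
                if feat in annot:
                    values[feat][form][annot[feat]] += 1
    scores = {}
    for feat, per_form in values.items():
        total = mismatched = 0
        for counts in per_form.values():
            n = sum(counts.values())
            total += n
            mismatched += n - max(counts.values())  # tokens off the majority value
        scores[feat] = mismatched / total if total else 0.0
    return scores

if __name__ == "__main__":
    # Usage: python inconsistency.py treebank1.conllu treebank2.conllu ...
    for feat, score in sorted(inconsistency_scores(sys.argv[1:]).items()):
        print(f"{feat}: {score:.3f}")
```

Under this reading, a low score for Case, Gender, or Number would reflect cross-dataset agreement, while a high UPOS score would flag forms tagged with different parts of speech in different treebanks.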