SE Constructions Revisited: Focus on Treebanks for Romance Languages
Proceedings of the Ninth Workshop on Universal Dependencies (UDW 2026)
Abstract
We analyze the current annotation of SE constructions, i.e., verbal constructions marked by the clitic se and its cognates, in several Universal Dependencies treebanks (version 2.17) for five Romance languages: French, Italian, Portuguese (European and Brazilian), Romanian, and Spanish. We discuss the morphological, syntactic, and semantic characteristics of these constructions in each language, both from a theoretical perspective and from that of existing annotation. To address inconsistencies in the data and strengthen Universal Dependencies as a scaffold for the automatic conversion of morphosyntactic annotation into semantic representations (Uniform Meaning Representation), we propose a clear distinction between argumental and non-argumental uses of the reflexive clitic and outline systematic ways to implement this distinction in annotation guidelines. We also examine how some of the reported inconsistencies are handled in the treebanks under study and discuss the extent to which these practices can be extended to other treebanks, within the same language or across languages.
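As a rough illustration of the kind of treebank survey described above, the following Python sketch tallies the dependency relations assigned to reflexive clitics in CoNLL-U data. It is not the authors' procedure. It assumes that the clitic carries the feature Reflex=Yes, which not every treebank guarantees, and the three Spanish sentences, the function names, and the coarse grouping of obj/iobj versus expl relations are invented for this example rather than taken from any treebank or from the paper.

```python
from collections import Counter

# Hypothetical sample rows (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL,
# DEPS, MISC). Invented for illustration; not copied from any UD treebank.
REFL = "Person=3|PronType=Prs|Reflex=Yes"
ROWS = [
    ("1", "Juan", "Juan", "PROPN", "_", "_", "3", "nsubj", "_", "_"),
    ("2", "se", "se", "PRON", "_", REFL, "3", "obj", "_", "_"),
    ("3", "lava", "lavar", "VERB", "_", "_", "0", "root", "_", "_"),
    None,  # sentence boundary
    ("1", "Se", "se", "PRON", "_", REFL, "2", "expl:pass", "_", "_"),
    ("2", "venden", "vender", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "casas", "casa", "NOUN", "_", "_", "2", "nsubj:pass", "_", "_"),
    None,
    ("1", "Se", "se", "PRON", "_", REFL, "2", "expl:pv", "_", "_"),
    ("2", "arrepiente", "arrepentir", "VERB", "_", "_", "0", "root", "_", "_"),
]
SAMPLE = "\n".join("" if row is None else "\t".join(row) for row in ROWS)


def tally_reflexive_deprels(conllu_text):
    """Count DEPREL values of tokens whose FEATS include Reflex=Yes."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # malformed line, multiword token range, or empty node
        feats = set(cols[5].split("|")) if cols[5] != "_" else set()
        if "Reflex=Yes" in feats:
            counts[cols[7]] += 1
    return counts


def coarse_group(deprel):
    """Assumed grouping for this sketch: obj/iobj vs. expl and its subtypes."""
    base = deprel.split(":")[0]
    if base in ("obj", "iobj"):
        return "argumental"
    if base == "expl":
        return "non-argumental"
    return "other"


counts = tally_reflexive_deprels(SAMPLE)
groups = Counter()
for deprel, n in counts.items():
    groups[coarse_group(deprel)] += n
print(dict(counts))
print(dict(groups))
```

Run over real treebank files, a tally of this kind would show which relations each treebank uses for the clitic. Whether a given label reflects an argumental or a non-argumental use still has to be judged against the annotation guidelines.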