Paper Information

lrec2026-ws-udw-21

Introducing Universal Dependencies for Sardinian: the UD ContSar Treebank

Abstract

This paper introduces the first steps towards the creation of a novel resource for contemporary Sardinian within the Universal Dependencies framework. Sardinian is a Romance language spoken in Sardinia, an island belonging to the Italian Republic and located in the center of the western Mediterranean. It is a minority and endangered language, traditionally transmitted mainly orally, and characterized by a multiplicity of varieties (usually grouped into two macro-varieties, Logudorese and Campidanese), all recognized as part of the Sardinian linguistic continuum. These varieties share basic morphosyntactic features but differ at the lexical level and in the realization of specific constructions. This internal variation can be particularly challenging with regard to the normalization of lemmas and the linguistic characterization of certain phenomena. The development of the treebank therefore aims to provide an annotated resource for contemporary Sardinian that takes into account the specificities of the different varieties, using Universal Dependencies to represent them within a unified theoretical framework, in order to facilitate both linguistic analysis and automatic processing. The present paper thus describes some linguistic characteristics of Sardinian and our attempts to encode them within the UD framework. Finally, we present the results of our evaluation of an NLP pipeline for Sardinian, trained on our corpus with the Stanford Stanza parser.

