Building an Endangered Language Resource in the Classroom: Universal Dependencies for Kakataibo

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/2e7rkojnqzim

Abstract

In this paper, we launch a new Universal Dependencies treebank for an endangered language from Amazonia: Kakataibo, a Panoan language spoken in Peru. We first discuss the collaborative methodology implemented, which proved effective for creating a treebank in the context of a Computational Linguistics course for undergraduates. Then, we describe the general details of the treebank and the language-specific considerations implemented for the proposed annotation. Finally, we conduct experiments on part-of-speech tagging and syntactic dependency parsing. We focus on monolingual and transfer learning settings, where we study the impact of a Shipibo-Konibo treebank, another Panoan language resource.

Details

Paper ID
lrec2022-main-409
Pages
pp. 3840-3851
BibKey
zariquiey-etal-2022-building
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20-25 June 2022

Authors

  • Roberto Zariquiey
  • Claudia Alvarado
  • Ximena Echevarría
  • Luisa Gomez
  • Rosa Gonzales
  • Mariana Illescas
  • Sabina Oporto
  • Frederic Blum
  • Arturo Oncevay
  • Javier Vera
