
Introducing YakuToolkit. Yakut Treebank and Morphological Analyzer.

Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

DOI:10.63317/35ib7ve5g6wf

Abstract

This poster presents the first publicly available treebank of Yakut, a Turkic language spoken in Russia, and a morphological analyzer for this language. The treebank was annotated following the Universal Dependencies (UD) framework, and the morphological analyzer can directly access and use its data. Yakut is an under-represented language whose prominence can be raised by making reliably annotated data, and NLP tools that can process it, freely accessible. The publication of both the treebank and the analyzer serves this purpose, with the prospect of evolving into a benchmark for the development of online NLP tools for other languages of the Turkic family in the future.

Details

Paper ID
lrec2022-ws-sigul-24
Pages
pp. 185-188
BibKey
merzhevich-ferraz-gerardi-2022-introducing
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Date
20-25 June 2022

Authors

  • Tatiana Merzhevich

  • Fabrício Ferraz Gerardi
