LREC 2022 Main Conference
A Free/Open-Source Morphological Analyser and Generator for Sakha
Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)
Abstract
We present, to our knowledge, the first published morphological analyser and generator for Sakha, a marginalised language of Siberia. The transducer, developed using HFST, has coverage solidly above 90% and high precision. In developing the analyser, we have expanded linguistic knowledge about Sakha and developed strategies for handling complex grammatical patterns. The transducer is already being used in downstream tasks, including computer-assisted language learning applications for linguistic maintenance and computational linguistic shared tasks.