LREC 2022 workshop

Evaluating Unsupervised Approaches to Morphological Segmentation for Wolastoqey

Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

DOI:10.63317/3m2hyjbcvpq8

Abstract

Finite-state approaches to morphological analysis have been shown to improve the performance of natural language processing systems for polysynthetic languages, in which words are generally composed of many morphemes, on tasks such as language modelling (Schwartz et al., 2020). However, finite-state morphological analyzers are expensive to construct and require expert knowledge of a language’s structure. Currently, there is no broad-coverage finite-state model of morphology for Wolastoqey, also known as Passamaquoddy-Maliseet, an endangered low-resource Algonquian language. In this paper, we therefore investigate two unsupervised models, MorphAGram and Morfessor, for obtaining morphological segmentations of Wolastoqey. We train both models on a small corpus of Wolastoqey words and evaluate them on two annotated datasets. Our results indicate that MorphAGram outperforms Morfessor for morphological segmentation of Wolastoqey.
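
To make the evaluation step concrete, the following is a minimal sketch of how predicted segmentations could be scored against an annotated dataset. It assumes boundary-level precision, recall, and F1, a common choice for this task; the abstract does not state which metric the paper uses. The function names and the toy English words are hypothetical illustrations, not data or code from the paper.

```python
def boundaries(morphemes):
    """Return the set of internal boundary offsets for one segmented word."""
    positions, offset = set(), 0
    for morpheme in morphemes[:-1]:  # no boundary after the final morpheme
        offset += len(morpheme)
        positions.add(offset)
    return positions


def boundary_prf(predicted, gold):
    """Micro-averaged boundary precision, recall, and F1.

    `predicted` and `gold` are parallel lists; each item is a list of
    morpheme strings for the same word. The metric itself is an assumption.
    """
    tp = fp = fn = 0
    for pred, ref in zip(predicted, gold):
        if "".join(pred) != "".join(ref):
            raise ValueError("predicted and gold segmentations spell different words")
        p, g = boundaries(pred), boundaries(ref)
        tp += len(p & g)   # boundaries found in both
        fp += len(p - g)   # predicted boundaries absent from the gold standard
        fn += len(g - p)   # gold boundaries the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example with English words (hypothetical, for illustration only).
gold = [["un", "believ", "able"], ["walk", "ed"]]
pred = [["un", "believable"], ["wal", "ked"]]
precision, recall, f1 = boundary_prf(pred, gold)
```

Scoring boundaries rather than whole words gives partial credit to a segmentation that places some morpheme boundaries correctly, which matters for polysynthetic languages where a single word may contain many morphemes.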

Details

Paper ID
lrec2022-ws-sigul-20
Pages
pp. 155-160
BibKey
bear-cook-2022-evaluating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Date
20 June 2022 to 25 June 2022

Authors

  • Diego Bear

  • Paul Cook

Links