
Voted-Perceptron Approach for Kazakh Morphological Disambiguation

Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

DOI:10.63317/4d8zqkkb9nsd

Abstract

This paper presents a voted perceptron approach to morphological disambiguation for the Kazakh language. Guided by the intuition that the feature value of the correct path of analyses must be higher than the feature value of any incorrect path, we propose a voted perceptron algorithm with Viterbi decoding for disambiguation. The approach can use arbitrary features to learn the feature vector for a sequence of analyses, which plays a vital role in disambiguation. Experimental results show that our approach outperforms other statistical and rule-based models. Moreover, we manually annotated a new morphological disambiguation corpus for the Kazakh language.
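
To illustrate the general idea in the abstract, the following is a minimal sketch of a structured perceptron with Viterbi decoding over a lattice of candidate analyses. It is not the authors' implementation: the feature templates (`local_features`), the toy tokens `x` and `y`, and the labels `N` and `V` are hypothetical placeholders, and weight averaging is swapped in as a common approximation to voting rather than an explicit voted ensemble.

```python
from collections import defaultdict

def local_features(word, prev_tag, tag):
    # Hypothetical templates: one emission and one transition feature.
    return [("emit", word, tag), ("trans", prev_tag, tag)]

def viterbi(words, candidates, w):
    """Return the highest-scoring path through the candidate analyses."""
    # best[i][tag] = (best score of a path ending in tag at i, previous tag)
    best = [{} for _ in words]
    for i, word in enumerate(words):
        prevs = list(best[i - 1].items()) if i > 0 else [("<s>", (0.0, None))]
        for tag in candidates[i]:
            best[i][tag] = max(
                ((score + sum(w[f] for f in local_features(word, prev, tag)), prev)
                 for prev, (score, _) in prevs),
                key=lambda pair: pair[0],
            )
    # Follow back-pointers from the best final analysis.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def path_features(words, path):
    """Count the features fired along a whole path of analyses."""
    feats = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, path):
        for f in local_features(word, prev, tag):
            feats[f] += 1
        prev = tag
    return feats

def train(data, epochs=5):
    """Perceptron training; returns averaged weights (approximation to voting)."""
    w = defaultdict(float)
    total = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for words, candidates, gold in data:
            pred = viterbi(words, candidates, w)
            if pred != gold:
                # Reward features of the gold path, penalize the predicted path.
                for f, c in path_features(words, gold).items():
                    w[f] += c
                for f, c in path_features(words, pred).items():
                    w[f] -= c
            for f, v in w.items():
                total[f] += v
            steps += 1
    return defaultdict(float, {f: v / steps for f, v in total.items()})

# Toy data: (tokens, candidate analyses per token, gold analyses).
data = [
    (["x", "y"], [["N", "V"], ["N", "V"]], ["N", "V"]),
    (["y", "x"], [["N", "V"], ["N", "V"]], ["V", "N"]),
]
weights = train(data)
```

On this toy data, `viterbi(words, candidates, weights)` recovers the gold path for both training sequences. The update only fires when the decoded path differs from the gold path, which reflects the intuition that the correct path should score higher than any incorrect one.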

Details

Paper ID
lrec2020-ws-sltu-36
Pages
pp. 258-264
BibKey
tolegen-etal-2020-voted
Publisher
European Language Resources Association (ELRA)
Date
11 May 2020 to 16 May 2020

Authors

  • Gulmira Tolegen

  • Alymzhan Toleu

  • Rustam Mussabayev
