Back to Main Conference 2010
LREC 2010main

A Positional Tagset for Russian

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/4oej892praz6

Abstract

Fusional languages have rich inflection. As a consequence, tagsets capturing their morphological features are necessarily large. A natural way to make a tagset manageable is to use a structured system. In this paper, we present a positional tagset for describing morphological properties of Russian. The tagset was inspired by the Czech positional system (Hajic, 2004). We have used preliminary versions of this tagset in our previous work (e.g., Hana et al. (2004, 2006); Feldman (2006); Feldman and Hana (2010)). Here, we both systematize and extend these preliminary versions (by adding information about animacy, aspect and reflexivity); give a more detailed description of the tagset and provide comparison with the Czech system. Each tag of the tagset consists of 16 positions, each encoding one morphological feature (part-of-speech, detailed part-of-speech, gender, animacy, number, case, possessor's gender and number, person, reflexivity, tense, aspect, degree of comparison, negation, voice, variant). The tagset contains approximately 2,000 tags.

Details

Paper ID
lrec2010-main-555
Pages
N/A
BibKey
hana-feldman-2010-positional
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • JH

    Jirka Hana

  • AF

    Anna Feldman

Links