LREC 2022 main

Constructing a Lexical Resource of Russian Derivational Morphology

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4numj4zvp7rf

Abstract

The words of any language are to some extent related through the ways they are formed. For instance, the verb ‘exempl-ify’ and the noun ‘example-s’ are both based on the word ‘example’, but the verb is derived from it, while the noun is inflected. In Natural Language Processing of Russian, inflection is handled satisfactorily; however, there are only a few machine-tractable resources that capture derivation, even though both of these morphological processes are very rich in Russian. Therefore, we devote this paper to improving one of the methods for constructing such resources and to applying the method to a Russian lexicon, which results in the largest lexical resource of Russian derivational relations. The resulting database, dubbed DeriNet.RU, includes more than 300 thousand lexemes connected by more than 164 thousand binary derivational relations. To create this data, we combined existing machine-learning methods, which we improved for this purpose. The whole approach is evaluated on our newly created data set of manual, parallel annotations. The resulting DeriNet.RU is freely available under an open license.

Details

Paper ID
lrec2022-main-298
Pages
pp. 2788-2797
BibKey
kyjanek-etal-2022-constructing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20-25 June 2022

Authors

  • Lukáš Kyjánek
  • Olga Lyashevskaya
  • Anna Nedoluzhko
  • Daniil Vodolazsky
  • Zdeněk Žabokrtský

Links