LREC 2022 main

Using Linguistic Typology to Enrich Multilingual Lexicons: the Case of Lexical Gaps in Kinship

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4gkptvktzgsp

Abstract

This paper describes a method to enrich lexical resources with content relating to linguistic diversity, based on knowledge from the field of lexical typology. We capture the phenomenon of diversity through the notion of lexical gap and use a systematic method to infer gaps semi-automatically on a large scale, which we demonstrate on the kinship domain. The resulting free diversity-aware terminological resource consists of 198 concepts, 1,911 words, and 37,370 gaps in 699 languages. We see great potential in the use of resources such as ours for the improvement of a variety of cross-lingual NLP tasks, which we illustrate through an application in the evaluation of machine translation systems.

Details

Paper ID
lrec2022-main-299
Pages
pp. 2798-2807
BibKey
khishigsuren-etal-2022-using
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 to 25 June 2022

Authors

  • Temuulen Khishigsuren

  • Gábor Bella

  • Khuyagbaatar Batsuren

  • Abed Alhakim Freihat

  • Nandu Chandran Nair

  • Amarsanaa Ganbold

  • Hadi Khalilia

  • Yamini Chandrashekar

  • Fausto Giunchiglia

Links