LREC 2022 (main conference)

IndoUKC: A Concept-Centered Indian Multilingual Lexical Resource

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4h3zh7rh3eaq

Abstract

We introduce the IndoUKC, a new multilingual lexical database covering eighteen Indian languages, with a focus on formally capturing words and word meanings specific to Indian languages and cultures. The IndoUKC reuses content from the existing IndoWordNet resource while providing a new model for the cross-lingual mapping of lexical meanings that allows for a richer, diversity-aware representation. Accordingly, beyond a thorough syntactic and semantic cleaning, the IndoWordNet lexical content has been remodeled to allow a more precise expression of language-specific meaning. The resulting database is available both for browsing through a graphical web interface and for download through the LiveLanguage data catalogue.

Details

Paper ID
lrec2022-main-303
Pages
pp. 2833-2840
BibKey
chandran-nair-etal-2022-indoukc
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 to 25 June 2022

Authors

  • Nandu Chandran Nair
  • Rajendran S. Velayuthan
  • Yamini Chandrashekar
  • Gábor Bella
  • Fausto Giunchiglia

Links