
Konkani Wordnet Resources

Proceedings of the 8th Workshop on Indian Language Data: Resources and Evaluation

DOI:10.63317/2sswqmbnfws3

Abstract

Konkani is a low-resource Indo-Aryan language spoken along the western coast of India, characterized by significant dialectal variation, multi-script usage, and limited standardized computational resources. This paper presents a consolidated and analysis-ready lexical resource derived from the Konkani Wordnet, built under the IndoWordNet framework. The resource comprises 32,370 synsets, 37,719 unique lexical entries, 32,370 glosses, and 33,318 example sentences, enriched with pronunciations, semantic relations, and illustrative examples. We describe the systematic extraction, normalization, and structural integration of wordnet data, resolving identifier inconsistencies and ensuring semantic coherence across distributed lexical files. To demonstrate the practical utility of this resource, we present an API-based bilingual vocabulary exercise generation system that leverages shared synset identifiers to automatically produce semantically aligned Hindi–Konkani word pairs for e-learning applications. The resulting resource enhances accessibility, reproducibility, and computational readiness for NLP tasks, while providing a foundational infrastructure for developing technology-driven teaching and learning tools for Konkani.
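The alignment idea mentioned in the abstract, pairing Hindi and Konkani words through shared synset identifiers, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' system: the dictionary layout, the synset ID values, and the function names `aligned_pairs` and `make_exercise` are all hypothetical, and the toy entries are not drawn from the resource itself.

```python
import random

# Hypothetical toy data: synset ID -> list of lemmas.
# IDs and entries are invented for illustration only.
hindi = {101: ["पानी", "जल"], 102: ["घर"], 103: ["किताब"]}
konkani = {101: ["उदक"], 102: ["घर"], 104: ["झाड"]}


def aligned_pairs(src, tgt):
    """Return (synset_id, src_lemma, tgt_lemma) for IDs present in both maps.

    Only the first lemma of each synset is used, which is a simplification.
    """
    shared_ids = sorted(src.keys() & tgt.keys())
    return [(sid, src[sid][0], tgt[sid][0]) for sid in shared_ids]


def make_exercise(pairs, seed=0):
    """Build a match-the-words exercise: prompts, shuffled options, answer key."""
    rng = random.Random(seed)  # seeded so the shuffle is reproducible
    options = [tgt_word for _, _, tgt_word in pairs]
    rng.shuffle(options)
    return {
        "prompts": [src_word for _, src_word, _ in pairs],
        "options": options,
        "answer_key": {src_word: tgt_word for _, src_word, tgt_word in pairs},
    }


pairs = aligned_pairs(hindi, konkani)
exercise = make_exercise(pairs)
```

Synsets that exist in only one language (IDs 103 and 104 in the toy data) are dropped by the set intersection, so every prompt in the exercise has a semantically aligned answer.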

Details

Paper ID: lrec2026-ws-wildre-06
Pages: pp. 49-54
BibKey: redkar-etal-2026-konkani
Editors: Girish Nath Jha, Kalika Bali, Sobha L, Devendr Kumar
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of the 8th Workshop on Indian Language Data: Resources and Evaluation
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Hanumant H. Redkar
  • Mahadev Gawas
  • Anjali Desai
  • Jyoti Pawar
