Back to Main Conference 2016
LREC 2016main

Synset Ranking of Hindi WordNet

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/5fcyf3y5m9ac

Abstract

Word Sense Disambiguation (WSD) is one of the open problems in the area of natural language processing. Various supervised, unsupervised and knowledge based approaches have been proposed for automatically determining the sense of a word in a particular context. It has been observed that such approaches often find it difficult to beat the WordNet First Sense (WFS) baseline which assigns the sense irrespective of context. In this paper, we present our work on creating the WFS baseline for Hindi language by manually ranking the synsets of Hindi WordNet. A ranking tool is developed where human experts can see the frequency of the word senses in the sense-tagged corpora and have been asked to rank the senses of a word by using this information and also his/her intuition. The accuracy of WFS baseline is tested on several standard datasets. F-score is found to be 60%, 65% and 55% on Health, Tourism and News datasets respectively. The created rankings can also be used in other NLP applications viz., Machine Translation, Information Retrieval, Text Summarization, etc.

Details

Paper ID
lrec2016-main-485
Pages
pp. 3039-3043
BibKey
bhingardive-etal-2016-synset
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 28 May 2016

Authors

  • SB

    Sudha Bhingardive

  • RS

    Rajita Shukla

  • JS

    Jaya Saraswati

  • LK

    Laxmi Kashyap

  • DS

    Dhirendra Singh

  • PB

    Pushpak Bhattacharyya

Links