Word Sense Disambiguation with Information Retrieval Technique

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/4zy8ehzsrraq

Abstract

This paper reports on word sense disambiguation of Korean nouns using an information retrieval technique. First, context vectors are constructed from the contextual words in the training data. Then, the words in each context vector are weighted by local density. Each sense of a target word is represented in word space as a 'Static Sense Vector', which is the centroid of the context vectors for that sense. Contextual noise is removed using selective sampling. The selective sampling method uses an information retrieval technique to enhance discriminative power: training samples are treated as indexed documents and test samples as queries. For each query (a test sample), the top-N relevant training samples are retrieved and used to construct a 'Dynamic Sense Vector'. The word sense is estimated using both the 'Static Sense Vector' and the 'Dynamic Sense Vector'. The Korean SENSEVAL test suite is used for this experiment, and our method produces relatively good results.
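The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: raw term counts stand in for the local-density weighting (which the abstract does not define), cosine similarity is used for retrieval and scoring, the static and dynamic scores are simply added, and the toy English data and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average a list of sparse vectors; an empty list gives an empty vector."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds the counts term by term
    n = len(vectors)
    return {t: w / n for t, w in total.items()} if n else {}


def disambiguate(train, query_words, top_n=3):
    """Pick a sense for a test sample from (sense, context_words) training pairs."""
    # Context vectors: raw counts (assumption; the paper weights by local density).
    vectors = [(sense, Counter(words)) for sense, words in train]
    query = Counter(query_words)

    # Static sense vector: centroid of all training context vectors per sense.
    by_sense = defaultdict(list)
    for sense, vec in vectors:
        by_sense[sense].append(vec)
    static = {s: centroid(vs) for s, vs in by_sense.items()}

    # Retrieval step: training samples are "documents", the test sample is the query.
    ranked = sorted(vectors, key=lambda sv: cosine(query, sv[1]), reverse=True)

    # Dynamic sense vector: centroid of the retrieved top-N samples per sense.
    retrieved = defaultdict(list)
    for sense, vec in ranked[:top_n]:
        retrieved[sense].append(vec)
    dynamic = {s: centroid(vs) for s, vs in retrieved.items()}

    # Combine both similarities (assumption: an unweighted sum).
    scores = {
        s: cosine(query, static[s]) + cosine(query, dynamic.get(s, {}))
        for s in static
    }
    return max(scores, key=scores.get)


# Hypothetical toy data for an ambiguous word with two senses.
TOY_TRAIN = [
    ("finance", ["money", "loan", "deposit"]),
    ("finance", ["loan", "interest", "account"]),
    ("river", ["water", "shore", "fish"]),
    ("river", ["water", "boat", "shore"]),
]
```

Calling `disambiguate(TOY_TRAIN, ["loan", "account"])` scores each sense against both its static centroid and the centroid of the retrieved neighbours; the dynamic vector is rebuilt for every query, which is what lets retrieval filter out training contexts unrelated to the test sample.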

Details

Paper ID
lrec2002-main-154
Pages
N/A
BibKey
oh-etal-2002-word
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 - 31 May 2002

Authors

  • Jong-Hoon Oh

  • Saim Shin

  • Yong-Seok Choi

  • Key-Sun Choi
