Publicly Available Topic Signatures for all WordNet Nominal Senses

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/38bec6uwmf4d

Abstract

Topic signatures are context vectors built for word senses and concepts. They can be acquired automatically from the web for any concept hierarchy using the "monosemous relative" method. Topic signatures have been shown to be useful in Word Sense Disambiguation, for modeling similarity between word senses, for classifying new terms in hierarchies, and for building hierarchical clusters of the senses of a given word. In this work we present a publicly available resource comprising both automatically extracted examples for all WordNet 1.6 noun senses and topic signatures built from those examples. We gathered around 700 sentences for each noun in WordNet. When the monosemous relatives are used to build a sense corpus for polysemous words, the corpus averages around 3,500 sentences per word sense. The resulting topic signatures contain around 4,500 words per word sense.
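
As a rough illustration of what "context vectors built for word senses" can look like, the sketch below turns a small sense corpus into weighted word vectors. It is not the authors' implementation: the tf-idf-style weighting is a stand-in chosen for brevity, the function names are hypothetical, and the sense labels and sentences are invented toy data standing in for examples gathered through monosemous relatives.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_topic_signatures(sense_corpora):
    """Map each sense to a {word: weight} context vector.

    sense_corpora: dict of sense label -> list of example sentences.
    Weighting (an assumption, not taken from the paper): term frequency
    in the sense corpus times log(N / number of senses whose corpus
    contains the word). Words seen in every sense corpus are dropped.
    """
    counts = {
        sense: Counter(tok for sent in sentences for tok in tokenize(sent))
        for sense, sentences in sense_corpora.items()
    }
    n_senses = len(counts)

    # Number of sense corpora in which each word appears.
    sense_freq = Counter()
    for counter in counts.values():
        sense_freq.update(counter.keys())

    signatures = {}
    for sense, counter in counts.items():
        signatures[sense] = {
            word: tf * math.log(n_senses / sense_freq[word])
            for word, tf in counter.items()
            if sense_freq[word] < n_senses
        }
    return signatures


# Invented toy data: two hypothetical senses of "church".
toy_corpora = {
    "church#building": [
        "The chapel roof was repaired",
        "The cathedral tower is tall",
    ],
    "church#service": [
        "The choir sang during the service",
        "The sermon was long",
    ],
}

signatures = build_topic_signatures(toy_corpora)
for sense, vector in signatures.items():
    top_words = sorted(vector, key=vector.get, reverse=True)[:3]
    print(sense, top_words)
```

In this toy setup, words shared by every sense corpus (such as "the") carry no discriminating information and are left out, while words specific to one sense receive positive weight. The released resource is far larger, at around 4,500 words per word sense.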

Details

Paper ID
lrec2004-main-487
Pages
N/A
BibKey
agirre-de-lacalle-2004-publicly
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Eneko Agirre

  • Oier Lopez de Lacalle
