Back to Main Conference 2008
LREC 2008main

Modelling Word Similarity: an Evaluation of Automatic Synonymy Extraction Algorithms.

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/2ioi6zdthgxa

Abstract

Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we use three vector-based models to retrieve semantically related words for a set of Dutch nouns and we analyse whether three linguistic properties of the nouns influence the results. In particular, we compare results from a dependency-based model with those from a 1st and 2nd order bag-of-words model and we examine the effect of the nouns’ frequency, semantic speficity and semantic class. We find that all three models find more synonyms for high-frequency nouns and those belonging to abstract semantic classses. Semantic specificty does not have a clear influence.

Details

Paper ID
lrec2008-main-204
Pages
N/A
BibKey
heylen-etal-2008-modelling
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 30 May 2008

Authors

  • KH

    Kris Heylen

  • YP

    Yves Peirsman

  • DG

    Dirk Geeraerts

  • DS

    Dirk Speelman

Links