Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

Wikipedia is a very popular source of encyclopedic knowledge which provides highly reliable articles in a variety of domains. This richness and popularity created a strong motivation among NLP researchers to develop relatedness measures between Wikipedia concepts. In this paper, we introduce WORD (Wikipedia Oriented Relatedness Dataset), a new type of concept relatedness dataset, composed of 19,276 pairs of Wikipedia concepts. This is the first human-annotated dataset of Wikipedia concepts, and its purpose is twofold. On the one hand, it can serve as a benchmark for evaluating concept-relatedness methods. On the other hand, it can be used as supervised data for developing new models for concept relatedness prediction. Among the advantages of this dataset over its term-relatedness counterparts are its built-in disambiguation solution and its richness in meaningful multiword terms. Based on this benchmark we develop a new tool, named WORT (Wikipedia Oriented Relatedness Tool), for measuring the level of relatedness between pairs of concepts. We show that the relatedness predictions of WORT outperform state-of-the-art methods.

Details

Paper ID

lrec2018-main-408

Pages

N/A

DOI

10.63317/4vj8kcyrbv4e

BibKey

ein-dor-etal-2018-semantic

Editors

Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

979-10-95546-00-9

Conference

Eleventh International Conference on Language Resources and Evaluation

Location

Miyazaki, Japan

Date

7 - 12 May 2018

Authors

Liat Ein Dor
Alon Halfon
Yoav Kantor
Ran Levy
Yosi Mass
Ruty Rinott
Eyal Shnarch
Noam Slonim
