Back to Main Conference 2010
LREC 2010main

Combining Resources: Taxonomy Extraction from Multiple Dictionaries

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2mdg8tmutxc7

Abstract

The idea that dictionaries are a good source for (computational) information has been around for a long while, and the extraction of taxonomic information from them is something that has been attempted several times. However, such information extraction was typically based on the systematic analysis of the text of a single dictionary. In this paper, we demonstrate how it is possible to extract taxonomic information without any analysis of the specific text, by comparing the same lexical entry in a number of different dictionaries. Counting word frequencies in the dictionary entry for the same word in different dictionaries leads to a surprisingly good recovery of taxonomic information, without the need for any syntactic analysis of the entries in question nor any kind of language-specific treatment. As a case in point, we will show in this paper an experiment extracting hyperonymy relations from several Spanish dictionaries, measuring the effect that the different number of dictionaries have on the results.

Details

Paper ID
lrec2010-main-324
Pages
N/A
BibKey
nazar-janssen-2010-combining
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • RN

    Rogelio Nazar

  • MJ

    Maarten Janssen

Links