A Graph-Based Approach for Computing Free Word Associations

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/22wc7jvpz3wq

Abstract

A graph-based algorithm is used to analyze the co-occurrences of words in the British National Corpus. We show that the statistical regularities it detects can be exploited to predict human word associations. The corpus-derived associations are evaluated on a large test set comprising several thousand stimulus/response pairs collected from humans. We find high agreement between the two types of data. The considerable size of the test set allows us to split the stimulus words into classes according to particular word properties. For example, we construct six saliency classes and, for the words in each class, compare the simulation results with the human data. For each class, there is a close relationship between the performance of our system and human performance. The same holds for classes based on two other word properties, namely syntactic and semantic ambiguity. We interpret these findings as evidence for the claim that human association acquisition must be based on the statistical analysis of perceived language, and that the detected statistical regularities are replicated when associations are produced.

Details

Paper ID
lrec2014-main-105
Pages
pp. 3027-3033
BibKey
enguix-etal-2014-graph
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 to 31 May 2014

Authors

  • Gemma Bel Enguix

  • Reinhard Rapp

  • Michael Zock
