LREC-COLING 2024 workshop

How Human-Like Are Word Associations in Generative Models? An Experiment in Slovene

Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

DOI:10.63317/4hzze2dthtnx

Abstract

Large language models (LLMs) show extraordinary performance in a broad range of cognitive tasks, yet their capability to reproduce human semantic similarity judgements remains disputed. We report an experiment in which we fine-tune two LLMs for Slovene, a monolingual SloT5 and a multilingual mT5, as well as an mT5 for English, to generate word associations. The models are fine-tuned on human word association norms created within the Small World of Words project, which recently started to collect data for Slovene. Since our aim was to explore differences between human and model-generated outputs, the model parameters were minimally adjusted to fit the association task. We perform automatic evaluation using a set of methods to measure overlap and ranking, and in addition a subset of human and model-generated responses were manually classified into four categories (meaning-, position- and form-based, and erratic). Results show that the human-machine overlap is very small, but that the models produce a distribution of association categories similar to that of humans.

Details

Paper ID
lrec2024-ws-cogalex-05
Pages
pp. 42-48
BibKey
vintar-etal-2024-human
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024
Location
N/A
Date
20 May 2024 – 25 May 2024

Authors

  • Špela Vintar

  • Mojca Brglez

  • Aleš Žagar

Links