LREC 2026 Main

Prompting Instruction-tuned LLMs for Semantic Similarity Values

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI: 10.63317/3kbjxx6989dg

Abstract

The impressive few-shot performance of generative decoder transformer language models at novel tasks has raised interest in using them to estimate lexical-semantic properties of words, word pairs or multi-word expressions. We explore the task of eliciting semantic similarity scores between word pairs through prompting, comparing these scores to human benchmarks. We investigate different prompting approaches, different model architectures and different languages using the Dutch, English and Mandarin Chinese SimLex-999 benchmarks. The results show that prompting each word pair individually yields better correlations, and that models struggle with the distinction between similarity and relatedness, just as static and contextual word embedding models did. The new, open-weight gpt-oss-20b model yields the highest correlation with human ratings out of the models we evaluated.

Details

Paper ID
lrec2026-main-891
Pages
pp. 11390–11403
BibKey
snelder-etal-2026-prompting
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Xander Akiko Snelder
  • Yunchong Huang
  • Jelke Bloem
