
Meta-Prompting Follow-Ups for Unsupervised Dialogue Evaluation Using Open-Source Large Language Models

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/4i8vxn9qi57r

Abstract

Automatically evaluating dialogue quality remains a major challenge due to the complexity and contextual variability of human interactions. This paper introduces DIET, a novel unsupervised, reference-free metric that uses follow-up utterances to assess dialogue quality. Unlike existing reference-free metrics, which rely on follow-ups derived from annotated data and apply a uniform set of utterances across all dialogues, DIET generates follow-ups using open-source Large Language Models (LLMs) and refines them through a selection process. Two strategies are explored: SELFMAP, where generation and evaluation are performed by the same model to ensure internal coherence, and CRAFT, where multiple models collaborate to generate diverse and complementary follow-ups, enhancing robustness and reducing model bias. Dialogue quality is measured via the likelihood of an LLM continuing the dialogue from the selected follow-ups. Experiments show that DIET correlates better with human judgments than existing reference-free metrics across multiple meta-evaluation datasets.
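The likelihood-based scoring described above can be sketched in a few lines. This is a hedged illustration, not the paper's exact formulation: the token log-probabilities would in practice come from an open-source LLM scoring each selected follow-up given the dialogue context, and the aggregation (length-normalized mean) is an assumed choice.

```python
# Sketch of a DIET-style score: the quality of a dialogue is approximated by
# how likely an LLM finds the selected follow-up utterances as continuations.
# Token log-probabilities here are toy placeholders for real LLM outputs.

def score_followup(token_logprobs):
    """Length-normalized log-likelihood of one follow-up utterance."""
    return sum(token_logprobs) / len(token_logprobs)

def diet_score(followup_logprobs):
    """Aggregate the selected follow-ups into one dialogue-quality score
    (mean of per-follow-up normalized log-likelihoods; an assumed scheme)."""
    return sum(score_followup(lp) for lp in followup_logprobs) / len(followup_logprobs)

# Toy example: two selected follow-ups with made-up token log-probs.
followups = [
    [-0.5, -1.0, -0.25],   # follow-up A
    [-2.0, -1.5],          # follow-up B
]
print(round(diet_score(followups), 4))
```

Higher (less negative) scores indicate follow-ups the model finds more plausible, which the metric takes as a proxy for better dialogue quality.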

Details

Paper ID
lrec2026-main-220
Pages
pp. 2812-2824
BibKey
cimino-etal-2026-meta
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Gaetano Cimino
  • Chuyuan Li
  • Giuseppe Carenini
  • Vincenzo Deufemia
