
Disentangling Approaches to Conversation Disentanglement: Fine-Tune or Learn from Scratch?

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3frwendckp7g

Abstract

Conversation disentanglement is the process of segmenting a stream of messages or utterances into separate conversations or "threads" that can be more easily understood and processed. We compare the performance of GPT-4o and GPT-4o Mini with deep learning models built from scratch for this task. We show that, using the same amount of training data, out-of-the-box GPT-4o performs poorly, while fine-tuning GPT-4o Mini yields performance comparable to small models trained from scratch (based on standard hand-crafted features for this task), reaching a 74.4% F1-score for predicting links between messages and a 45.3% F1-score for predicting perfectly matching conversations. However, the fine-tuned GPT-4o Mini model underperforms compared to models that exploit complex structural information. We also provide a new method for detailed analysis of the successes and failures of our models, together with a new visualization technique.

Details

Paper ID
lrec2026-main-229
Pages
pp. 2927-2941
BibKey
pal-etal-2026-disentangling
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Debaditya Pal

  • Anton Leuski

  • Ron Artstein

  • David Traum

  • Kallirroi Georgila

Links