LREC-COLING 2024, Main Conference

Task-Oriented Paraphrase Analytics

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/43n3dcqgt257

Abstract

Since paraphrasing is an ill-defined task, the term “paraphrasing” covers text transformation tasks with different characteristics. Consequently, existing paraphrasing studies have applied quite different (explicit and implicit) criteria as to when a pair of texts is to be considered a paraphrase, all of which amount to postulating a certain level of semantic or lexical similarity. In this paper, we conduct a literature review and propose a taxonomy to organize the 25 identified paraphrasing (sub-)tasks. Using classifiers trained to identify the tasks that a given paraphrasing instance fits, we find that the distributions of task-specific instances in the known paraphrase corpora vary substantially. This means that the use of these corpora, without the respective paraphrase conditions being clearly defined (which is the normal case), must lead to incomparable and misleading results.

Details

Paper ID
lrec2024-main-1360
Pages
pp. 15640-15654
BibKey
gohsen-etal-2024-task
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 to 25 May 2024

Authors

  • Marcel Gohsen

  • Matthias Hagen

  • Martin Potthast

  • Benno Stein

Links