
Beyond Lemmas and Syntax: Comparing Human and LLM-Generated Scientific Abstracts

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/4t9vnvcb943b

Abstract

In this study, we compare human-written (HWT) and machine-generated (MGT) abstracts of scientific papers, going beyond traditional lexical and syntactic analyses. We use an extensive corpus of publications on computational linguistics submitted to the Association for Computational Linguistics from the mid-1950s to 2022. First, we generate abstracts with three state-of-the-art models (GPT-4o, Llama 3.1 and Qwen 2.5), providing the models with the full texts of the papers, and then compare these abstracts to those written by humans. We study the overall information content of abstracts, operationalised as surprisal, and the distribution of information in abstracts, quantified as local Uniform Information Density (UID); both metrics are related to processing effort. Subsequently, we perform an extrinsic evaluation through topic modelling and clustering using the BERTopic model. Our results show significant differences in both surprisal and UID, suggesting that abstracts generated by Llama are less cognitively demanding and show a more uniform distribution of information. Our topic modelling experiments show greater divergence between humans and LLMs than between LLM pairs. At the same time, Llama abstracts seem to be more semantically similar to those written by humans, in line with previous findings suggesting such similarity at the lexical and syntactic levels.
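To make the two intrinsic metrics concrete, the following minimal Python sketch computes per-token surprisal and one common formulation of local UID (the mean squared difference between the surprisals of consecutive tokens). The abstract does not state the exact formula or the language model used to estimate probabilities, so this formulation, the function names, and the toy probabilities below are assumptions made for illustration only, not the authors' implementation.

```python
import math


def surprisal(probabilities):
    """Per-token surprisal in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in probabilities]


def mean_surprisal(surprisals):
    """Average information content per token."""
    return sum(surprisals) / len(surprisals)


def local_uid(surprisals):
    """Mean squared difference between consecutive surprisals.

    Assumed formulation: lower values mean information is spread
    more uniformly across the sequence.
    """
    if len(surprisals) < 2:
        return 0.0
    diffs = [(b - a) ** 2 for a, b in zip(surprisals, surprisals[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical token probabilities; in practice these would come
# from a language model scoring each token of an abstract.
uniform_probs = [0.25, 0.25, 0.25, 0.25]
bursty_probs = [0.5, 0.125, 0.5, 0.125]

uniform_s = surprisal(uniform_probs)
bursty_s = surprisal(bursty_probs)

print(mean_surprisal(uniform_s), local_uid(uniform_s))
print(mean_surprisal(bursty_s), local_uid(bursty_s))
```

In this toy example both sequences have the same mean surprisal, but the second alternates between predictable and unpredictable tokens and therefore receives a higher local UID score. This shows why the two metrics are reported separately: one captures how much information a text carries, the other how evenly it is distributed.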

Details

Paper ID
lrec2026-main-304
Pages
pp. 3823-3832
BibKey
bagdasarov-etal-2026-beyond
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 – 16 May 2026

Authors

  • Sergei Bagdasarov

  • Diego Alves
