
TURING: Evaluating Human Abilities to Identify AI-Generated Texts

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/4e4ojwwryi8d

Abstract

This study analyzes humans’ ability to identify AI-generated texts across 10 genres. We collected 9164 annotations from 214 participants on 500 texts (half human-written, half LLM-produced) and analyzed 7943 after quality screening. Our main findings are that human accuracy was above chance but far from perfect (around 59%), with a slight tendency to label texts as "Human-generated". Performance was influenced by the text genre (structural/factual formats were easier to identify than complex genres) and by the generating LLM. Annotators could optionally select three-level descriptors to justify their decisions. While these had a very limited effect on accuracy, their usage showed some association between text features (monotony, lack of cohesion or coherence) and "AI-generated" labeling. However, after correction, the linguistic features of the texts appear to have no robust impact on human judgment. A small learning effect emerged but was practically negligible (0.1-0.2%), and annotators' personal characteristics affected their accuracy, with the exception of age, which showed no effect. Finally, two automated detection tools were tested, reaching 88% accuracy on our distribution, clearly above human performance, which highlights the value of human-tool combinations.

Details

Paper ID
lrec2026-main-355
Pages
pp. 4527-4535
BibKey
kalashnikova-etal-2026-turing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Natalia Kalashnikova

  • Nicolas De Bufala

  • Sophie Fayad

  • Laurent Cervoni
