DIDECO: An Annotated Dataset for Intent Detection in Digital Communications

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3xsu5rdnw8hi

Abstract

This paper presents DIDECO, the first annotated dataset specifically designed for detecting both explicit and implicit intents in digital communications. We address a critical gap in cybersecurity research by developing a comprehensive taxonomy that distinguishes between explicit communicative goals (what is requested) and implicit persuasion mechanisms (how compliance is engineered). Grounded in Speech Act Theory and persuasion psychology principles, our taxonomy encompasses 20 distinct intent categories across explicit and implicit intents. We annotated 220 LLM-generated spear-phishing emails using a multi-label protocol with six trained annotators, yielding 2,162 intent annotations that reveal the layered complexity of malicious communications. Our analysis demonstrates that sophisticated attacks employ multiple concurrent intents, combining explicit communicative goals with implicit persuasion strategies. This dataset provides resources for developing intent-aware detection systems capable of identifying sophisticated social engineering attacks through semantic analysis.

Details

Paper ID
lrec2026-main-542
Pages
pp. 6808-6822
BibKey
popovic-etal-2026-dideco
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Senaid Popovic

  • Damien Riquet

  • Maxime Meyer

  • Fabien Lauer

  • Yannick Parmentier

Links