LREC 2026 main

GENIUS Keylog Corpus - a German High School Student Corpus with Keystroke Logging Data

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/4zdv5j2yj6vg

Abstract

Student writing has been studied either as a final product (the arguments in a finished text) or as a writing process (keystroke data), but not in an integrated manner. We present the GENIUS Keylog Corpus, to our knowledge the first publicly available dataset that combines comprehensive argumentative annotations with keystroke logging data. It comprises 259 German argumentative essays written by high school students. Our analysis reveals that 96% of students wrote linearly without recursion and 88% omitted the conclusion section. Writing was mostly fluent, without extensive pauses, mainly due to the time limit for completing the task. Additionally, we propose a methodology for combining annotations with keystroke events and carry out an exploratory analysis of writer profiles.

Details

Paper ID
lrec2026-main-550
Pages
pp. 6919-6928
BibKey
schaller-etal-2026-genius
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Nils-Jonathan Schaller

  • Thorben Jansen

  • Lars Höft

  • Hannah Pünjer

  • Andrea Horbach

Links