GENIUS Keylog Corpus - a German High School Student Corpus with Keystroke Logging Data
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Student writing has been studied either as a final product (the arguments in a finished text) or as a writing process (keystroke data), but not in an integrated manner. We present the GENIUS Keylog Corpus, to our knowledge the first publicly available dataset that combines comprehensive argumentation annotations with keystroke logging data. It comprises 259 German argumentative essays written by high school students. Our analysis reveals that 96% of students wrote linearly without recursion and 88% omitted the conclusion section. The writing process was largely fluent, without extensive pauses, mainly because of the time limit for completing the task. Additionally, we propose a methodology for combining annotations with keystroke events and carry out an exploratory analysis of writer profiles.