GePaDeU - a Multi-layer Corpus of German Parliamentary Debates with Rich Semantic and Pragmatic Annotations
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
This paper presents GePaDeU, a new manually annotated corpus of German Parliamentary Debates with Unified layers of semantic and pragmatic information. The data comprises 267 speeches given by 197 members of the German Bundestag between 2017 and 2021. The final release of our corpus unifies multiple annotation layers, including entity-level annotations, speech events and their corresponding speakers, functional speech acts, clause-level aspect, and moral framing. We provide an overview of the annotation layers and illustrate how the semantic and pragmatic annotations can be combined for corpus-linguistic studies and discourse analyses, and to answer research questions in political science. The new resource will be made freely available to the research community.