DR-CUP: A Dataset on Real-time Commentary in U.S. Presidential Debates
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Presidential debates are critical platforms for political discourse, yet existing research lacks datasets tailored to analyzing real-time professional commentary. To address this gap, we introduce the Dataset on Real-time Commentary in U.S. Presidential Debates (DR-CUP), which aligns U.S. presidential debate transcripts (2016–2024) with professional commentary and annotations. DR-CUP supports research on commentary understanding, planning, and generation, offering insights into expert analysis and its role in contextualizing complex political discourse. In pilot studies, we evaluated state-of-the-art large language models (LLMs) and found notable performance differences in understanding expert commentary and in planning the generation of professional commentary. DR-CUP is the first dataset to incorporate real-time cross-document alignment for debate data, providing a comprehensive resource for advancing research in political communication and computational social science.