Modelling Legal Compliance in a Consent Wizard Application as Part of a Research-Centered and User-Oriented Data Infrastructure
Proceedings of the Joint Workshop on Legal and Ethical Issues in Human Language Technologies and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (LEGAL2026 and CALD-pseudo 2026) @ LREC 2026
Abstract
Recent research calls for data management infrastructures that explicitly operate within ethical and legal constraints and facilitate adherence to Open Science principles by integrating automated support for the planning, collection, storage, use, reuse, and sharing of data. Legal and ethical requirements for data processing have become increasingly complex, introducing administrative barriers to scientific research on data generated by human participants, which encompasses the vast majority of humanities research. In response, we present RUDI ("Research-centered User-oriented Data Infrastructure"), a modular framework grounded in an interdisciplinary approach informed by legal, computational, and linguistic expertise. This paper introduces its first component: a configurable and dynamically adaptive consent form generator in the form of a "wizard" web application. We outline how legal aspects are modeled within the application and highlight its concrete benefits for the administrative aspects of research. Further, we discuss the contextualization of data within the research domain through the use of a standardized ontology within the framework.