Authorship Attribution in the Times of LLMs within the Framework of the CRediT Taxonomy
Proceedings of the Joint Workshop on Legal and Ethical Issues in Human Language Technologies and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (LEGAL2026 and CALD-pseudo 2026) @ LREC 2026
Abstract
This article examines the concept of authorship in the context of generative language models and other uses of artificial intelligence (AI), and asks how this new ‘authorshipness’ can be represented in metadata. It analyses authorship under copyright law and proposes a metadata-based approach to disclosing the use of AI in publications. The approach draws on the widely adopted CRediT taxonomy, standardized by the National Information Standards Organization (NISO), and is informed by guidance from the United States Copyright Office (USCO) and the International Association of Scientific, Technical and Medical Publishers (STM).