
DIN 19461: A National Standard for Derived Text Formats

Proceedings of Leveraging Derived Text Formats to Unlock Copyrighted Collections for Open Science (DTF) @ LREC 2026

DOI:10.63317/3u92avo9gg59

Abstract

We present DIN 19461:2026-06 (E), a German draft national standard that defines categories, terminology, and process requirements for Derived Text Formats (DTFs) created from text documents in natural language. The standard specifies enrichment and information reduction operations, requirements for combining multiple DTFs, and documentation obligations for publication, archiving, and reuse. Its aim is to enable legally compliant sharing and analysis of texts, especially where copyright or data protection prevents distributing the originals, while maintaining scientific utility and reproducibility through explicit recording of processes and parameters. We outline the scope, the key concepts, and the four core reduction operations (retain, delete, replace, randomise), give examples across token-, structure-, and vector-based DTFs, and describe implications for infrastructures (e.g., ISO 24622-based metadata). Finally, we discuss limitations, open questions (e.g., reconstruction risks with modern ML models), and next steps for adoption and maintenance.
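
As an informal illustration of the four reduction operations named above, the following Python sketch applies each of them to a token list and records the operation and its parameters alongside the result. It is not taken from the standard: the function names, the `<X>` placeholder, the choice of shuffling as the randomisation, and the record layout are all assumptions made for this example, and the normative definitions in DIN 19461 may differ.

```python
import random

# Hypothetical token-level sketch; names and semantics are illustrative
# assumptions, not the definitions given in DIN 19461.

def retain(tokens):
    """Keep the token sequence unchanged."""
    return list(tokens)

def delete(tokens, targets):
    """Drop every token whose lowercase form is in `targets`."""
    return [t for t in tokens if t.lower() not in targets]

def replace(tokens, targets, placeholder="<X>"):
    """Substitute a placeholder for every token whose lowercase form is in `targets`."""
    return [placeholder if t.lower() in targets else t for t in tokens]

def randomise(tokens, seed):
    """Shuffle token order with a seeded generator so the step can be repeated."""
    rng = random.Random(seed)
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled

OPERATIONS = {
    "retain": retain,
    "delete": delete,
    "replace": replace,
    "randomise": randomise,
}

def derive(tokens, operation, **params):
    """Apply one operation and return the result together with a process record."""
    result = OPERATIONS[operation](tokens, **params)
    return {"operation": operation, "parameters": params, "tokens": result}

tokens = "Anna met Ben in Berlin".split()
names = {"anna", "ben"}

print(derive(tokens, "delete", targets=names)["tokens"])
print(derive(tokens, "replace", targets=names)["tokens"])
print(derive(tokens, "randomise", seed=7)["parameters"])
```

Returning the operation name and parameters with the derived tokens mirrors the abstract's emphasis on explicit process and parameter recording: a seeded shuffle, for instance, can only be repeated if the seed is kept with the output.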

Details

Paper ID
lrec2026-ws-dtf-09
Pages
pp. 67-75
BibKey
trippel-etal-2026-din
Editors
Florian Barth, Keli Du, José Calvo Tello, Philippe Genêt, Piroska Lendvai, Christof Schöch, Thorsten Trippel
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of Leveraging Derived Text Formats to Unlock Copyrighted Collections for Open Science (DTF) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Thorsten Trippel
  • Florian Barth
  • José Calvo Tello
  • Keli Du
  • Philippe Genêt
  • Daniel Kurzawe
  • Peter Leinen
  • Piroska Lendvai
  • Christof Schöch
  • Andreas Witt
  • Arden Zimmermann
