From Print to Digital and beyond: The Retrodigitization of a Historical Dictionary of Italian as a Hybrid Lexical Resource

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/338howsz93sg

Abstract

This paper presents the retrodigitization project of the Grande Dizionario della Lingua Italiana (GDLI), the largest and most comprehensive historical dictionary of the Italian language. The GDLI’s 23,000 pages — originally designed for human consultation — constitute an exceptional repository of linguistic and cultural-historical information, while posing significant challenges to large-scale digitization and data structuring. The project, still ongoing, will result in the development of a set of interoperable and interlinked resources: (i) a TEI-XML edition of the dictionary text, encoding its complex lexicographic and citation structure; (ii) an annotated corpus of the quoted examples, enabling linguistic and historical research across centuries; and (iii) a database of cited authors and works. Together, these components form a hybrid lexical resource that establishes the foundations for innovative and advanced modes of accessing and exploring the rich and multifaceted content of this historical dictionary.

Details

Paper ID
lrec2026-main-057
Pages
pp. 770-777
BibKey
biffi-etal-2026-print
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Marco Biffi
  • Sebastiana Cucurullo
  • Manuel Favaro
  • Elisa Guadagnini
  • Simonetta Montemagni
  • Eva Sassolini
