Conversion of the Clark Hall Dictionary of Old English to TEI with RDF: An End-to-end Pipeline for Lexicographic Resource Retrodigitization
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
In this submission we introduce a pipeline for creating TEI editions of legacy dictionaries using a parser based on a context-free grammar (CFG). We illustrate it with an ongoing project that aims to create a digital edition of an Old English dictionary, Clark Hall’s "A Concise Anglo-Saxon Dictionary". We begin by motivating our CFG-based approach, discussing its advantages and disadvantages, and comparing it to other approaches. We argue that this approach is suitable for certain kinds of dictionaries, such as Clark Hall’s. We then describe the microstructure of the dictionary itself, with a view both to justifying the rules which we subsequently describe and to outlining the kinds of resources to which we believe our approach is best suited. Next, we describe the CFG parser itself and give an account of our experiments in parsing the dictionary. Finally, we outline the enrichment of the parsed dictionary with RDFa and the benefits this has for the published data.