From Lemmas to Links: A Lemma Bank for Ancient Greek
Proceedings of the Fourth Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2026) @ LREC 2026
Abstract
This paper introduces the Greek Lemmabank, the core component of the Linking Greek knowledge base, developed according to Linked Open Data principles. To address the fragmentation of existing Ancient Greek resources, the lemmabank adopts a descriptive, ontology-driven approach inspired by LiLa (Linking Latin). Lemmas are modelled as canonical forms within an OntoLex-Lemon–compliant framework, which preserves alternative choices of canonical form and dialectal variation while enabling interoperable linking across heterogeneous datasets. The resource is populated by integrating data from the Ancient Greek WordNet and the Liddell–Scott–Jones lexicon, followed by normalisation and harmonisation with the Universal Dependencies tagset. The resulting dataset establishes a lemma-centric infrastructure for interlinking corpora, lexica, and NLP tools for Ancient Greek.
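As a rough illustration of the modelling idea summarised above, the following minimal sketch represents one lemma as an OntoLex-Lemon form carrying two written representations (a dialectal pair) and a Universal Dependencies part-of-speech tag, and serialises it as N-Triples. Only the `ontolex:Form` class and the `ontolex:writtenRep` property are taken from OntoLex-Lemon; the `http://example.org/lemmabank/` namespace, the `hasPOS` property, the lemma identifier, and the `Lemma` class are hypothetical placeholders and do not reflect the actual vocabulary or URIs of the resource.

```python
from dataclasses import dataclass

ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace, used here for illustration only.
EX = "http://example.org/lemmabank/"


@dataclass
class Lemma:
    """A lemma modelled as a canonical form with one or more spellings."""
    lemma_id: str
    written_reps: list[str]  # alternative written representations
    upos: str                # Universal Dependencies POS tag

    def triples(self):
        """Return (subject, predicate, object) triples for this lemma."""
        subject = EX + "lemma/" + self.lemma_id
        out = [(subject, RDF_TYPE, ONTOLEX + "Form")]
        for rep in self.written_reps:
            # Literals are tagged with the ISO 639 code for Ancient Greek.
            out.append((subject, ONTOLEX + "writtenRep", f'"{rep}"@grc'))
        # "hasPOS" is an assumed property name, not a documented one.
        out.append((subject, EX + "hasPOS", EX + "pos/" + self.upos))
        return out


def to_ntriples(triples):
    """Serialise triples as N-Triples; objects starting with a quote are literals."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Both spellings are kept on the same lemma rather than one being discarded.
lemma = Lemma("1", ["θάλασσα", "θάλαττα"], "NOUN")
print(to_ntriples(lemma.triples()))
```

Keeping both spellings on a single node is one simple way to preserve variation descriptively; external corpora and lexica would then link to the lemma URI rather than to a particular spelling.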