Towards Semantic Access and Interoperability in Digital Dialectal Atlases. A Case Study
Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective
Abstract
The increasing digital availability of dialectal atlases has significantly improved access to dialectal data and expanded their potential for linguistic and cultural studies. Despite their richness, however, such resources remain difficult to integrate into contemporary data-driven research workflows because of their complex data structures and limited interoperability. Most digital dialectal atlases still rely on traditional map-centered access models that offer only implicit, coarse-grained semantic structures, which limits concept-based exploration and integration with other linguistic resources in the Linguistic Linked Open Data (LLOD) ecosystem. This paper presents a case study on the Atlante Lessicale Toscano (ALT) that addresses these limitations by introducing an explicit semantic layer designed to support both user-oriented exploration and machine-actionable interoperability. While ALT already provides a conceptual organization of its dialectal materials, this structure was originally conceived for human navigation rather than for integration with computational lexical-semantic resources. To bridge this gap, we align ALT concepts with ItalWordNet, leveraging its synset-based model as a widely adopted semantic backbone in NLP and LLOD infrastructures. The case study focuses on the domain of agriculture, whose historically grounded conceptual distinctions are often underrepresented in general-purpose lexical resources. The paper proposes a mapping strategy, analyzes coverage and mismatch patterns, and releases a new resource aligning ALT agricultural concepts with ItalWordNet, thereby laying the groundwork for the interoperability and reusability of dialectal atlas data.
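As a rough illustration of what a concept-to-synset alignment and its coverage analysis might look like in practice, the following minimal Python sketch models alignment records and computes simple coverage and mismatch statistics. Everything in it is an assumption made for illustration: the `Alignment` record, the relation labels (`exact`, `broader`, `none`), the concept labels, and the synset identifiers are hypothetical placeholders, not the actual ALT or ItalWordNet data or the mapping scheme adopted in the paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Alignment:
    """One hypothetical link between an atlas concept and a wordnet synset."""
    alt_concept: str            # placeholder concept label, not a real ALT identifier
    iwn_synset: Optional[str]   # placeholder synset ID; None when no synset was found
    relation: str               # assumed labels: "exact", "broader", "none"


def coverage(alignments: Iterable[Alignment]) -> float:
    """Return the share of distinct concepts linked to at least one synset."""
    alignments = list(alignments)
    concepts = {a.alt_concept for a in alignments}
    linked = {a.alt_concept for a in alignments if a.iwn_synset is not None}
    return len(linked) / len(concepts) if concepts else 0.0


def relation_counts(alignments: Iterable[Alignment]) -> Counter:
    """Count how often each relation label occurs, as a crude mismatch profile."""
    return Counter(a.relation for a in alignments)


# Illustrative records only; labels and IDs are invented for this sketch.
sample = [
    Alignment("aratro", "iwn-n-0001", "exact"),
    Alignment("erpice", "iwn-n-0002", "broader"),
    Alignment("covone", None, "none"),
    Alignment("falce", "iwn-n-0003", "exact"),
]

print(coverage(sample))
print(relation_counts(sample))
```

Keeping unmatched concepts in the table with an explicit `none` relation, rather than dropping them, makes gaps in the general-purpose resource visible and countable alongside the successful matches.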