Designing a Data Model for a Diachronic Sign Language Database: A Case Study of Nineteenth-Century Bohemian Sources

Proceedings of the LREC 2026 12th Workshop on the Representation and Processing of Sign Languages: Language in Motion

Abstract

Diachronic research on sign languages is limited by the fragmentary and heterogeneous nature of historical documentation. Eighteenth- and nineteenth-century printed texts and manuscripts contain valuable lexical data, but their descriptions vary in precision, terminology, and representational conventions. This paper proposes a structured data model for a diachronic sign language database designed to systematise such archival materials. The model adopts a multi-layered architecture that separates primary evidence from analytical interpretation, distinguishes attested from inferred sign parameters, applies graded confidence levels, and encodes structural, iconic, and metaphorical properties in parallel layers. Detailed source metadata ensures traceability and makes uncertainty explicit. The model is illustrated with sign attestations drawn from nineteenth-century Bohemian sources. The case study demonstrates that even fragmentary records, most commonly preserved in dictionaries and pedagogical materials as written descriptions or illustrations, can be represented systematically within a unified data model suitable for structured comparison and diachronic analysis. The model may also provide a methodological basis for comparable work on other European sign languages.
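
To make the layered architecture more concrete, the following is a minimal sketch of how one attestation record might be structured. It is not the schema proposed in the paper: all class and field names, the three-level confidence scale, and the parameter names (handshape, location) are assumptions introduced for illustration, and the example record holds placeholder values rather than data from any source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    ATTESTED = "attested"  # stated or depicted in the source itself
    INFERRED = "inferred"  # reconstructed by the analyst


class Confidence(Enum):
    # Hypothetical three-level scale; the abstract only says "graded".
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Source:
    """Source metadata kept with every record for traceability."""
    title: str
    year: Optional[int]
    medium: str  # e.g. "written description" or "illustration"


@dataclass(frozen=True)
class ParameterValue:
    """One sign parameter with its evidential status and confidence."""
    name: str
    value: str
    status: Status
    confidence: Confidence


@dataclass
class Attestation:
    """Primary evidence plus separate, parallel analytical layers."""
    gloss: str
    source: Source
    primary_evidence: str  # verbatim description or image reference
    structural: List[ParameterValue] = field(default_factory=list)
    iconic: List[str] = field(default_factory=list)
    metaphorical: List[str] = field(default_factory=list)

    def attested_parameters(self) -> List[ParameterValue]:
        """Return only parameters given directly by the source."""
        return [p for p in self.structural if p.status is Status.ATTESTED]

    def parameters_at_least(self, level: Confidence) -> List[ParameterValue]:
        """Return parameters whose confidence meets the given level."""
        return [p for p in self.structural
                if p.confidence.value >= level.value]


# Placeholder record: every value below is illustrative, not real data.
record = Attestation(
    gloss="EXAMPLE-SIGN",
    source=Source(title="(hypothetical dictionary)", year=None,
                  medium="written description"),
    primary_evidence="(verbatim text of the source description)",
    structural=[
        ParameterValue("handshape", "(placeholder)",
                       Status.ATTESTED, Confidence.HIGH),
        ParameterValue("location", "(placeholder)",
                       Status.INFERRED, Confidence.LOW),
    ],
)
```

Keeping the verbatim evidence in its own field, apart from the interpreted parameters, means an analysis can be revised without altering the record of what the source actually says, and queries can be restricted to attested or high-confidence values when comparing signs across periods.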