Handling Cross-Dialect Syntactic Variation: A Theory-Driven Web Resource
Proceedings of the First Workshop on Dialects in NLP — A Resource Perspective
Abstract
Cross-dialect syntactic variation tests the limits of comparative analysis, owing to the entanglement of inheritance and contact in dialect systems. Addressing this challenge requires analytical tools combining the theoretical depth of formal models of grammatical competence with quantitative taxonomic techniques. The Parametric Comparison Method (PCM) embodies this integration by quantifying structural similarity across grammars through the comparison of abstract syntactic rules. The method has been shown to achieve a good degree of resolution in dialectal domains, capturing subtle contrasts and yielding configurations aligning with phylogenetic expectations while remaining sensitive to contact-induced convergence. Fully assessing its effectiveness as a resource for the quantitative study of syntactic dialectology, however, requires an infrastructure that ensures systematic data collection, consistent parameter setting, and robust statistical evaluation across diverse datasets. The PCM Hub is a web-based resource designed for this purpose. It integrates guided elicitation, automated parameter-setting procedures, data management, and the computation of distances and automatic classifications within a unified environment. By standardizing the transition from raw linguistic observations to a structured, replicable empirical apparatus, the PCM Hub provides the practical and quantitative support necessary to test the power of the PCM across expanded comparative domains.
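To make the distance computation mentioned above concrete, the following is a minimal sketch and not the PCM Hub's actual implementation. It assumes that each grammar is encoded as a string of parameter states (`+`, `-`, or `0` for a parameter that is not independently set) and that distance is the share of differing values among the parameters set in both grammars. The dialect names, the parameter strings, and the function name `parametric_distance` are hypothetical and purely illustrative.

```python
from itertools import combinations


def parametric_distance(a: str, b: str) -> float:
    """Normalized Hamming distance over the parameters set in both grammars.

    Assumption: '+' and '-' are set values; '0' marks a parameter that is
    not independently set and is skipped in the comparison.
    """
    if len(a) != len(b):
        raise ValueError("grammars must be described by the same parameter list")
    identities = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "0" or y == "0":
            continue  # not comparable for this pair of grammars
        if x == y:
            identities += 1
        else:
            differences += 1
    comparable = identities + differences
    if comparable == 0:
        raise ValueError("no comparable parameters for this pair")
    return differences / comparable


# Hypothetical parameter settings for three invented dialects.
settings = {
    "dialect_A": "++-0+",
    "dialect_B": "+--0+",
    "dialect_C": "-+-+0",
}

# Pairwise distances, keyed by dialect pair.
distances = {
    (p, q): parametric_distance(settings[p], settings[q])
    for p, q in combinations(settings, 2)
}

for pair, value in distances.items():
    print(pair, round(value, 3))
```

A table of pairwise distances of this kind is the usual input to clustering or tree-building procedures, which is one way the automatic classifications mentioned in the abstract could be derived.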