A Historical Database for the Study of Obstruent-Lateral Palatalization in Ibero-Romance
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Studying irregular sound changes requires documenting not only the words that underwent the change but also those that did not. Obstruent-lateral (OL) palatalization in Ibero-Romance, i.e., Galician, Portuguese, and Spanish, is one such change, exhibiting three distinctive patterns: unusual distribution (/pl fl kl/ typically palatalized, whereas /bl gl/ rarely did), irregular implementation (not all eligible words underwent palatalization), and variable outcomes (dependent on obstruent voicing and the position of the cluster within the word). This paper presents a cross-linguistic historical dataset of 659 inherited words, principally from Galician, Portuguese, and Spanish, with and without palatalization, traceable to etyma containing OL clusters. The dataset draws on etymological dictionaries, philological works, and historical corpora. A digitized version of the Diccionario Crítico Etimológico Castellano e Hispánico (Corominas and Pascual, 2012) served as the backbone for systematically identifying etyma containing OL clusters. The compiled corpus contains 473 words whose etymologies are certain, with comparable coverage across the three languages. By providing the first comprehensive compilation of both palatal and non-palatal historical evidence, this dataset enables the systematic study of OL palatalization in Ibero-Romance.