Transfer Learning for Creole TTS: A Pilot Study on Whether Substrate Phonologies or Lexifier Vocabularies Matter More
Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"
Abstract
In this early-stage study, we investigate whether transfer learning from lexifier or substrate languages can improve text-to-speech (TTS) performance for low-resource creoles. We conducted a controlled experiment using two creoles of distinct lexical origins: Nigerian Pidgin (English-based) and Guadeloupean Creole (French-based). For each creole, we recorded a single-speaker TTS dataset of approximately 30 minutes and used it to fine-tune pretrained English, French, and Yoruba models. Synthesis quality was assessed with objective metrics and informal subjective evaluations. Though partially inconclusive, our results suggest that the French-based models outperform the others for both creoles, while the Yoruba-based models yield weaker performance. These findings may indicate that neither lexical similarity nor historical influence alone fully predicts transfer learning effectiveness, and that phonotactic compatibility and orthographic depth may also be relevant factors. Our work provides insight into TTS model development for creoles and other low-resource languages, and highlights avenues for further research on leveraging relevant linguistic and orthographic features.