LREC 2026 workshop

Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus

Proceedings of the Second Workshop on Building Educational Applications Using NLP

DOI:10.63317/39457jk69sij

Abstract

We present a unified pipeline for synthesizing high-quality Quechua and Spanish speech for the Peruvian Constitution using three state-of-the-art text-to-speech (TTS) architectures: XTTS v2, F5-TTS, and DiFlow-TTS. Our models are trained on independent Spanish and Quechua speech datasets of heterogeneous sizes and recording conditions, and leverage bilingual and multilingual TTS capabilities to improve synthesis quality in both languages. By exploiting cross-lingual transfer, our framework mitigates data scarcity in Quechua while preserving naturalness in Spanish. We release trained checkpoints, inference code, and synthesized audio for each constitutional article, providing a reusable resource for speech technologies in indigenous and multilingual contexts. This work contributes to the development of inclusive TTS systems for political and legal content in low-resource settings.

Details

Paper ID
lrec2026-ws-politicalnlp-07
Pages
pp. 64-69
BibKey
ortega-etal-2026-giving
Editors
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Second Workshop on Building Educational Applications Using NLP
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • John E. Ortega

  • Rodolfo Joel Zevallos

  • Fabrício Carraro
