LLM-Assisted Spanish Dialect Corpus Construction
This study presents a multi-dialect, pragmatically annotated Spanish corpus designed to address persistent gaps in the representation of regional varieties and communicative functions in existing linguistic and NLP resources. The corpus focuses exclusively on Spanish dialects spoken in the Americas, selecting one representative dialect per country and incorporating a single neutral Castilian variety for comparative purposes. Dialects are organized into five regional groups: Mexican, Central American, Caribbean, South American, and Rioplatense Spanish. Corpus development follows a multi-stage workflow in which a seed lexicon of openly licensed material, drawn from sources such as Wikipedia, Project Gutenberg, and curated random and synthetic data, initiates LLM-based text generation. Each base sentence is expanded into dialect-specific variants and annotated with pragmatic and domain labels, producing a fully parallel dataset that supports cross-dialect comparison. A multi-stage correction pipeline combining automated scripts, controlled LLM-based editing, and manual review ensures syntactic well-formedness and dialectal authenticity while eliminating language-switching and hallucination errors. The final version of the corpus covers 20 dialects and contains 40,000 annotated sentences, released in both JSON and plain-text formats for use in a wide range of NLP tasks.