The Construction of a Mixe Variant Parallel Corpus

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/4iizmd3in9aj

Abstract

We present the progress and challenges of constructing a Mixe-Spanish parallel corpus for machine translation. Mixe is a Mexican Indigenous language with more than 100,000 speakers. In particular, we focus on the San Juan Guivicovic Mixe variant (mir). The resulting resource is available under an open research license (CC BY-NC-SA). It was created following a previous state-of-the-art methodology for Mexican Indigenous languages. In this case, we used paid translators from the variant region. We also present a baseline system.

Details

Paper ID
lrec2026-main-274
Pages
pp. 3456-3461
BibKey
ruiz-etal-2026-construction
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Ivan Vladimir Meza Ruiz
  • Delfino Zacarias Marquez
  • Martha Elba Ramírez Andrés
  • Victoriano Santiago Cayetano
  • Jonathan Santiago Antonio
  • Carlos Daniel Hernández Mena

Links