LREC 2026 Workshop

OpenCor: Latin American and Iberian Languages Open Corpora Forum

Proceedings of LANLP: Bridging Ibero and Latin American NLP Communities

DOI: 10.63317/42vgyuuoduhr

Abstract

The availability of open resources and corpora is a fundamental requirement for research in Natural Language Processing (NLP) and Computational Linguistics; however, languages spoken in Latin America and the Iberian Peninsula, particularly Indigenous, minority, and regional varieties, remain structurally under-resourced and under-represented. This paper presents a historical account of OpenCor (Latin American and Iberian Languages Open Corpora Forum), a community-driven initiative created to promote, document, and discuss open linguistic corpora and lexical resources for these languages. Conceived as a collaborative forum rather than a competitive evaluation venue, OpenCor focuses on data creation, licensing practices, sustainability, and community building. Between 2018 and 2024, OpenCor was organized as a recurring workshop co-located with major conferences, fostering dialogue across countries, institutions, and linguistic traditions. By documenting the initiative’s motivations, organizational trajectory, submission trends, and the diversity of resources presented, this paper aims to preserve institutional memory, highlight the often-invisible labor of corpus development, and provide a reference for future initiatives dedicated to openness and linguistic diversity.

Details

Paper ID
lrec2026-ws-lanlp-04
Pages
pp. 21-28
BibKey
real-etal-2026-opencor
Editors
German Rigau Claramunt, Pablo Gamallo, Rafael Muñoz Guillena, Luis Chiruzzo, Eugenio Martínez Cámara
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of LANLP: Bridging Ibero and Latin American NLP Communities
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Livy Real
  • Valeria de Paiva

Links