Back to Main Conference 2014
LREC 2014main
ACTIV-ES: a comparable, cross-dialect corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)
Abstract
Corpus resources for Spanish have proved invaluable for a number of applications in a wide variety of fields. However, a majority of resources are based on formal, written language and/or are not built to model language variation between varieties of the Spanish language, despite the fact that most language in everyday use is informal/ dialogue-based and shows rich regional variation. This paper outlines the development and evaluation of the ACTIV-ES corpus, a first-step to produce a comparable, cross-dialect corpus representative of the everyday language of various regions of the Spanish-speaking world.