
Towards a Motivated Annotation Schema of Collocation Errors in Learner Corpora

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/4ph63v3wh9oj

Abstract

Collocations play a significant role in second language acquisition. To offer efficient support to learners, an NLP-based CALL environment for learning collocations should be based on a representative learner corpus annotated with collocation errors. So far, however, no theoretically motivated collocation error tag set is available. Existing learner corpora tag collocation errors simply as “lexical errors”, which is clearly insufficient given the wide range of collocation errors that learners make. In this paper, we present a fine-grained three-dimensional typology of collocation errors, derived in an empirical study from the learner corpus CEDEL2, which was compiled by a team at the Autonomous University of Madrid. The first dimension captures whether the error concerns the collocation as a whole or one of its elements; the second captures the language-oriented error analysis; and the third captures the interpretative error analysis. To facilitate smooth annotation along this typology, we adapted Knowtator, a flexible off-the-shelf annotation tool implemented as a Protégé plugin.

Details

Paper ID
lrec2010-main-520
Pages
N/A
BibKey
ramos-etal-2010-towards
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17-23 May 2010

Authors

  • Margarita Alonso Ramos
  • Leo Wanner
  • Orsolya Vincze
  • Gerard Casamayor del Bosque
  • Nancy Vázquez Veiga
  • Estela Mosqueira Suárez
  • Sabela Prieto González
