Summary of the paper

Title Using an Error-Annotated Learner Corpus to Develop an ESL/EFL Error Correction System
Authors Na-Rae Han, Joel Tetreault, Soo-Hwa Lee and Jin-Young Ha
Abstract This paper presents research on building a model of grammatical error correction, for preposition errors in particular, in English text produced by language learners. Unlike most previous work, which trains a statistical classifier exclusively on well-formed text written by native speakers, we train a classifier on a large-scale, error-tagged corpus of English essays written by ESL learners, relying on contextual and grammatical features surrounding preposition usage. First, we show that such a model can achieve high performance values: 93.3% precision and 14.8% recall for error detection and 81.7% precision and 13.2% recall for error detection and correction when tested on preposition replacement errors. Second, we show that this model outperforms models trained on well-edited text produced by native speakers of English. We discuss the implications of our approach in the area of language error modeling and the issues stemming from working with a noisy data set whose error annotations are not exhaustive.
Topics Authoring tools, proofing, Grammar and Syntax, Language modelling
Bibtex @InProceedings{HAN10.821,
  author = {Na-Rae Han and Joel Tetreault and Soo-Hwa Lee and Jin-Young Ha},
  title = {Using an Error-Annotated Learner Corpus to Develop an ESL/EFL Error Correction System},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }