
Error Correction for Arabic Dictionary Lookup

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2xy4djonz58d

Abstract

We describe a new Arabic spelling correction system which is intended for use with electronic dictionary search by learners of Arabic. Unlike other spelling correction systems, this system does not depend on a corpus of attested student errors but on student- and teacher-generated ratings of confusable pairs of phonemes or letters. Separate error modules for keyboard mistypings, phonetic confusions, and dialectal confusions are combined to create a weighted finite-state transducer that calculates the likelihood that an input string could correspond to each citation form in a dictionary of Iraqi Arabic. Results are ranked by the estimated likelihood that a citation form could be misheard, mistyped, or mistranscribed for the input given by the user. To evaluate the system, we developed a noisy-channel model trained on students’ speech errors and used it to perturb citation forms from a dictionary. We compare our system to a baseline based on Levenshtein distance and find that, when evaluated on single-error queries, our system performs 28% better than the baseline (overall MRR) and is twice as good at returning the correct dictionary form as the top-ranked result. We believe this to be the first spelling correction system designed for a spoken, colloquial dialect of Arabic.
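As a rough illustration of the evaluation setup mentioned in the abstract, the sketch below ranks dictionary forms by plain Levenshtein distance (the kind of baseline the abstract refers to) and scores the ranking with mean reciprocal rank (MRR). It is not the authors' implementation: the function names, the alphabetical tie-break, and the toy romanized lexicon and queries are all assumptions made for this example, and none of the strings are taken from the paper's dictionary or data.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(query: str, lexicon: List[str]) -> List[str]:
    """Order citation forms by distance to the query.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    return sorted(lexicon, key=lambda form: (levenshtein(query, form), form))


def mean_reciprocal_rank(queries: Dict[str, str], lexicon: List[str]) -> float:
    """Average of 1/rank of the intended form over all queries (0 if absent)."""
    total = 0.0
    for query, target in queries.items():
        ranked = rank_candidates(query, lexicon)
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(queries)


# Hypothetical toy data: romanized strings chosen only for illustration.
lexicon = ["kitab", "katib", "kalb", "qalb"]
queries = {"kitap": "kitab", "galb": "qalb"}  # noisy query -> intended form

print(mean_reciprocal_rank(queries, lexicon))  # prints 0.75
```

In this toy run the query "galb" is one edit away from both "kalb" and "qalb", so unit-cost distance cannot tell them apart and the intended form lands at rank 2. A scheme that assigns different costs to different confusions, as the weighted transducer in the abstract does, could in principle separate such candidates.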

Details

Paper ID
lrec2010-main-303
Pages
N/A
BibKey
rytting-etal-2010-error
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17–23 May 2010

Authors

  • C. Anton Rytting
  • Paul Rodrigues
  • Tim Buckwalter
  • David Zajic
  • Bridget Hirsch
  • Jeff Carnes
  • Nathanael Lynn
  • Sarah Wayland
  • Chris Taylor
  • Jason White
  • Charles Blake III
  • Evelyn Browne
  • Corey Miller
  • Tristan Purvis
