LREC 2014 main

ColLex.en: Automatically Generating and Evaluating a Full-form Lexicon for English

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/4ekt9s567q99

Abstract

The paper describes a procedure for the automatic generation of a large full-form lexicon of English. We put emphasis on two statistical methods for lexicon extension and adjustment: a letter-based HMM and a detector of spelling variants and misspellings. The resulting resource, ColLex.en, is evaluated with respect to two tasks: text categorization and lexical coverage, using the SUSANNE corpus and the Open ANC as examples.

Details

Paper ID: lrec2014-main-075
Pages: pp. 3756-3760
BibKey: vor-der-bruck-etal-2014-collex
Editor: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: 978-2-9517408-8-4
Conference: Ninth International Conference on Language Resources and Evaluation
Location: Reykjavik, Iceland
Date: 26-31 May 2014

Authors

  • Tim vor der Brück

  • Alexander Mehler

  • Zahurul Islam
