
Toward Active Learning in Data Selection: Automatic Discovery of Language Features During Elicitation

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/2a4t8x5auvi7

Abstract

Data Selection has emerged as a common issue in language technologies. We define Data Selection as the choosing of a subset of training data that is most effective for a given task. This paper describes deductive feature detection, one component of a data selection system for machine translation. Feature detection determines whether features such as tense, number, and person are expressed in a language. The database of The World Atlas of Language Structures provides a gold standard against which to evaluate feature detection. The discovered features can be used as input to a Navigator, which uses active learning to determine which piece of language data is the most important to acquire next.

Details

Paper ID
lrec2008-main-059
Pages
N/A
BibKey
clark-etal-2008-toward
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Jonathan Clark
  • Robert Frederking
  • Lori Levin
