Back to Main Conference 2010
LREC 2010main

Learning Based Java for Rapid Development of NLP Systems

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/3ppewyg87rep

Abstract

Today's natural language processing systems are growing more complex with the need to incorporate a wider range of language resources and more sophisticated statistical methods. In many cases, it is necessary to learn a component with input that includes the predictions of other learned components or to assign simultaneously the values that would be assigned by multiple components with an expressive, data dependent structure among them. As a result, the design of systems with multiple learning components is inevitably quite technically complex, and implementations of conceptually simple NLP systems can be time consuming and prone to error. Our new modeling language, Learning Based Java (LBJ), facilitates the rapid development of systems that learn and perform inference. LBJ has already been used to build state of the art NLP systems. In this paper, we first demonstrate that there exists a theoretical model that describes most NLP approaches adeptly. Second, we show how our improvements to the LBJ language enable the programmer to describe the theoretical model succinctly. Finally, we introduce the concept of data driven compilation, a translation process in which the efficiency of the generated code benefits from the data given as input to the learning algorithms.

Details

Paper ID
lrec2010-main-517
Pages
N/A
BibKey
rizzolo-roth-2010-learning
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • NR

    Nick Rizzolo

  • DR

    Dan Roth

Links