The BNC Parsed with RASP4UIMA

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/3vtqqkunfrb7

Abstract

We have integrated the RASP system with the UIMA framework (RASP4UIMA) and used this to parse the XML-encoded version of the British National Corpus (BNC). All original annotation is preserved, and parsing information, mainly in the form of grammatical relations, is added in an XML format. A few specific adaptations of the system to give better results with the BNC are discussed briefly. The RASP4UIMA system is publicly available and can be used to parse other corpora or document collections, and the final parsed version of the BNC will be deposited with the Oxford Text Archive.

Details

Paper ID
lrec2008-main-149
Pages
N/A
BibKey
andersen-etal-2008-bnc
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Øistein E. Andersen

  • Julien Nioche

  • Ted Briscoe

  • John Carroll
