Multi-level XML-based Corpus Annotation

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

Abstract

In this paper we present the methodological principles and the implementation framework of text annotation process in an Information Extraction setting. Due to the recent prevalence of XML as a means for describing structured documents in a reusable format, our team has switched to an XML based annotation schema. In that framework, an XML annotation platform has been built, while processing tools, lexical resources and textual data communicate with each other via this platform. Editing/viewing tools have been implemented, endowed with functionalities that allow annotators to gain access to previous annotation levels as well as necessary lexical resources.