PIA-Core: Semantic Annotation through Example-based Learning

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

Abstract

This paper summarizes the aims and scope of the PIA (Portable Information Access) project’s PIA-Core system for automatic annotation of documents on the Semantic Web, i.e. the next generation World Wide Web. The focus of the project is to develop a portable information extraction system that can be easily adapted to new domains. PIA has its foundations on three resources: the PIA-Core information extraction module, application modules and PIA guidelines for ensuring consistent annotation. We are currently developing PIA-Core based on advanced machines learning methods to automatically annotate documents with terminology, names, temporal and quantity expressions etc. using examples of annotated documents.