Introduction of KIBS (Korean Information Base System) Project

Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000)

Abstract

The KIBS project focuses on resources and tools for Korean NLP. The main research is the construction of a raw corpus of 64 million tokens and a part-of-speech (POS) tagged corpus of about 11 million tokens. We have also developed analytic tools for constructing these corpora and supporting tools for navigating them. This paper presents the current state of the work carried out by the KIBS project. We introduce the KAIST tag set for POS and syntax, intended for a standard corpus, together with its annotation principles. We also explain several types of errors found in the tagged corpus.