Design and Implementation of the Online ILSP Greek Corpus
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000)
Abstract
This paper presents the Hellenic National Corpus (HNC), the corpus of Modern Greek developed by the Institute for Language and Speech Processing (ILSP). It describes all stages of the creation of the corpus: collection of the material, tagging and tokenizing, construction of the database, and the online implementation, which aims to make the corpus accessible over the Internet to the research community.