LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title Design and Implementation of the Online ILSP Greek Corpus
Authors Hatzigeorgiu Nick (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, nikos@ilsp.gr)
Gavrilidou Maria (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, maria@ilsp.gr)
Piperidis Stelios (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25, Athens, Greece, tel: +301 6875300, fax: +301 6854270, spip@ilsp.gr)
Carayannis George (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, gcara@ilsp.gr)
Papakostopoulou Anastasia (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, natassa@ilsp.gr)
Spiliotopoulou Athanassia (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, aspil@ilsp.gr)
Vacalopoulou Anna (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, avacalop@ilsp.gr)
Labropoulou Penny (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, penny@ilsp.gr)
Mantzari Elena (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, elena@ilsp.gr)
Papageorgiou Harris (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, xaris@ilsp.gr)
Demiros Iason (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25, Athens, Greece, iason@ilsp.gr)
Keywords Corpus, Database, Greek Language, Internet, Web Application
Session Session WP8 - Corpus Tools
Full Paper 336.ps, 336.pdf
Abstract This paper presents the Hellenic National Corpus (HNC), the corpus of Modern Greek developed by the Institute for Language and Speech Processing (ILSP). The presentation describes all stages in the creation of the corpus: collection of the material, tokenizing and tagging, construction of the database, and the online implementation, which aims to make the corpus accessible over the Internet to the research community.