LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title A Unified POS Tagging Architecture and its Application to Greek
Authors Papageorgiou Harris (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, xaris@ilsp.gr)
Prokopidis Prokopis (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25 Maroussi, Greece, prokopis@ilsp.gr)
Giouli Voula (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25, Athens, Greece, tel: +301 6875300, fax: +301 6854270, voula@ilsp.gr)
Piperidis Stelios (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25, Athens, Greece, tel: +301 6875300, fax: +301 6854270, spip@ilsp.gr)
Keywords Greek, POS Tagging, Transformation Based Learning, XML
Session Session WO18 - Morphology in Lexical and Textual Resources
Full Paper 181.ps, 181.pdf
Abstract This paper proposes a flexible, unified tagging architecture that can be incorporated into applications such as information extraction, cross-language information retrieval, term extraction, and summarization, while also providing an essential component for subsequent syntactic processing or lexicographical work. It introduces a feature-based, multi-tiered approach to part-of-speech tagging (the FBT tagger). FBT is a variant of the well-known transformation-based learning paradigm that aims to improve tagging quality for highly inflected languages such as Greek. A large experiment on Greek is also reported, with results presented for a variety of text genres, including financial reports, newswires, press releases, and technical manuals. Finally, the adopted evaluation methodology is discussed.