A Part-of-speech tagger for Irish using Finite-State Morphology and Constraint Grammar Disambiguation

Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)

DOI:10.63317/5dk8zd6572ok

Abstract

This paper describes the methodology used to develop a part-of-speech tagger for Irish, which is used to annotate a corpus of 30 million words of text with part-of-speech tags and lemmas. The tagger is evaluated using a manually disambiguated test corpus and it currently achieves 95% accuracy on unrestricted text. To our knowledge, this is the first part-of-speech tagger for Irish.

Details

Paper ID
lrec2006-main-103
Pages
N/A
BibKey
ui-dhonnchadha-van-genabith-2006-part
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-2-4
Conference
Fifth International Conference on Language Resources and Evaluation
Location
Genoa, Italy
Date
24 May 2006 to 26 May 2006

Authors

  • E. Uí Dhonnchadha

  • J. Van Genabith