Reusable Tagset Conversion Using Tagset Drivers
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)
Abstract
Part-of-speech or morphological tags are an important means of annotation in a vast number of corpora. However, different corpora use different tagsets, even for the same language. Tagset conversion is difficult, and existing solutions tend to be tailored to a particular pair of tagsets. We propose a universal approach that makes the conversion tools reusable. We also provide an indirect evaluation in the context of a parsing task.