Part-of-Speech Tagging of Transcribed Speech
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)
Abstract
We used four part-of-speech (POS) taggers, all available for research purposes and originally trained on text, to tag a corpus of transcribed multiparty spoken dialogues. The assigned tags were then manually corrected. The corrected annotation was used first to evaluate the four taggers and then to retrain them. Despite limited time, money, and annotators, we reached results comparable to those reported for the taggers on text. Based on our experience, we present guidelines for producing reliably POS-tagged corpora for new domains.