Chasing the Perfect Splitter: A Comparison of Different Compound Splitting Tools
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)
Abstract
This paper reports on the evaluation of two compound splitters for German. Compounding is a very frequent phenomenon in German, so natural language processing applications need efficient ways of detecting and correctly splitting compound words. The paper presents different strategies for compound splitting, focusing on German, and describes four German compound splitters. Two of them were used in Statistical Machine Translation (SMT) experiments, where they obtained very similar quality scores in terms of BLEU and TER; a thorough evaluation of both was therefore carried out.