Compiling large language resources using lexical similarity metrics for domain taxonomy learning
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)
Abstract
In this contribution we present a new methodology for compiling large language resources for domain-specific taxonomy learning. We describe the stages needed to handle the rich morphology of an agglutinative language, namely Korean, and outline a second-order machine learning algorithm that uncovers term similarity from a given raw text corpus. The language resource compilation described here is part of a fully automatic top-down approach to constructing taxonomies, without the human effort that is usually required.