Back to Main Conference 2002
LREC 2002main

Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/3xeqgrk4od47

Abstract

In this paper, we propose a method to improve Co-training and apply it to word sense disambiguation problems. Co-training is an unsupervised learning method to overcome the problem that labeled training data is fairly expensive to obtain. Co-training is theoretically promising, but it requires two feature sets with the conditional independence assumption. This assumption is too rigid. In fact there is no choice but to use incomplete feature sets, and then the accuracy of learned rules reaches a limit. In this paper, we check co-occurrence between two feature sets to avoid such undesirable situation when we add unlabeled instances to training data. In experiments, we applied our method to word sense disambiguation problems for the three Japanese words ‘koe’, ‘toppu’ and ‘kabe’ and demonstrated that it improved Co-training.

Details

Paper ID
lrec2002-main-002
Pages
N/A
BibKey
shinnou-2002-learning
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 31 May 2002

Authors

  • HS

    Hiroyuki Shinnou

Links