Pattern Discovery in Named Organization Corpus

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/4qtuqftbt4in

Abstract

This paper presents a method for mining formulation rules from a named organization corpus. It adopts the TEIRESIAS algorithm, which is widely used in the bioinformatics domain. Experimental results on the MET2 test bed show that treating the morpheme of a keyword as a cluster performs best, treating all keywords as a single cluster comes next, and treating each keyword as its own cluster performs worst. The morpheme-based approach performs slightly better than the hand-crafted approach. The methodology can be easily extended to other types of named entities.

Details

Paper ID
lrec2004-main-039
Pages
N/A
BibKey
chen-chu-2004-pattern
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Hsin-Hsi Chen

  • Yi-Lin Chu
