Back to Main Conference 2008
LREC 2008main

Learning the Species of Biomedical Named Entities from Annotated Corpora

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/3z6xmbmihxix

Abstract

In biomedical articles, terms with the same surface forms are often used to refer to different entities across a number of model organisms, in which case determining the species becomes crucial to term identification systems that ground terms to specific database identifiers. This paper describes a rule-based system that extracts “species indicating words”, such as human or murine, which can be used to decide the species of the nearby entity terms, and a machine-learning species disambiguation system that was developed on manually species-annotated corpora. Performance of both systems were evaluated on gold-standard datasets, where the machine-learning system yielded better overall results.

Details

Paper ID
lrec2008-main-074
Pages
N/A
BibKey
wang-grover-2008-learning
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 30 May 2008

Authors

  • XW

    Xinglong Wang

  • CG

    Claire Grover

Links