System Evaluation on a Named Entity Corpus from Clinical Notes

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/3vomgg3feu6k

Abstract

This paper presents an evaluation of the dictionary look-up component of Mayo Clinic’s Information Extraction system. The component was tested on a corpus of 160 free-text clinical notes that were manually annotated for the named entity type "disease". Clinical text of this kind poses many language challenges, such as fragmented sentences and heavy use of abbreviations and acronyms. The dictionary used for this evaluation was a subset of SNOMED-CT restricted to the semantic types corresponding to diseases/disorders, without any augmentation. The algorithm achieves an F-score of 0.56 for exact matches, and F-scores of 0.76 and 0.62 for right-partial and left-partial matches, respectively. Machine learning techniques are currently under investigation to improve performance on this task.
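
The distinction between exact and partial matching can be illustrated with a minimal sketch. It is not the paper's evaluation code. It assumes that annotations are character-offset spans, that a "right-partial" match means the predicted and gold spans share the same end offset, and that a "left-partial" match means they share the same start offset; the paper may define these terms differently. The names `Span` and `f_score` are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Assumption: an annotation is a (start, end) character-offset pair.
Span = Tuple[int, int]


def f_score(gold: Iterable[Span], predicted: Iterable[Span], mode: str = "exact") -> float:
    """Balanced F-score of predicted spans against gold spans.

    mode="exact": both boundaries must agree.
    mode="right": only the end offsets must agree (assumed definition).
    mode="left":  only the start offsets must agree (assumed definition).
    """
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    def matches(p: Span, g: Span) -> bool:
        if mode == "exact":
            return p == g
        if mode == "right":
            return p[1] == g[1]
        if mode == "left":
            return p[0] == g[0]
        raise ValueError(f"unknown mode: {mode}")

    # Precision: share of predicted spans that match some gold span.
    matched_pred = sum(1 for p in pred_set if any(matches(p, g) for g in gold_set))
    # Recall: share of gold spans matched by some predicted span.
    matched_gold = sum(1 for g in gold_set if any(matches(p, g) for p in pred_set))

    precision = matched_pred / len(pred_set) if pred_set else 0.0
    recall = matched_gold / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under these assumed definitions every exact match also counts as a right-partial and a left-partial match, so the partial scores can never fall below the exact score, which is consistent with the ordering of the figures reported in the abstract.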

Details

Paper ID
lrec2008-main-365
Pages
N/A
BibKey
schuler-etal-2008-system
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Karin Schuler
  • Vinod Kaggal
  • James Masanz
  • Philip Ogren
  • Guergana Savova
