Annotation of specialized corpora using a comprehensive entity and relation scheme

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/2jzmssd7547a

Abstract

Annotated corpora are essential resources for many applications in Natural Language Processing. They provide insight into the linguistic and semantic characteristics of the genre and domain covered, and can be used to train and evaluate automatic tools. In the biomedical domain, annotated corpora of English texts have become available for several genres and subfields. However, very few similar resources are available for languages other than English. In this paper we present an effort to produce a high-quality corpus of clinical documents in French, annotated with a comprehensive scheme of entities and relations. We present the annotation scheme as well as the results of a pilot annotation study covering 35 clinical documents in a variety of subfields and genres. We show that high inter-annotator agreement can be achieved using a complex annotation scheme.

Details

Paper ID
lrec2014-main-453
Pages
pp. 1267-1274
BibKey
deleger-etal-2014-annotation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 to 31 May 2014

Authors

  • Louise Deléger

  • Anne-Laure Ligozat

  • Cyril Grouin

  • Pierre Zweigenbaum

  • Aurélie Névéol
