LREC 2014 main

N³ - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/3x49zs558iyn

Abstract

Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N³). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Details

Paper ID
lrec2014-main-662
Pages
pp. 3529-3533
BibKey
roder-etal-2014-n3
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26-31 May 2014

Authors

  • Michael Röder
  • Ricardo Usbeck
  • Sebastian Hellmann
  • Daniel Gerber
  • Andreas Both

Links