
Crowdsourced Corpus with Entity Salience Annotations

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/3s7wuv3vcrhg

Abstract

In this paper, we present a crowdsourced dataset which adds entity salience (importance) annotations to the Reuters-128 dataset, which is a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and re-use. We show the potential of the dataset on the task of learning an entity salience classifier and report on the results from several experiments.

Details

Paper ID
lrec2016-main-527
Pages
pp. 3307-3311
BibKey
dojchinovski-etal-2016-crowdsourced
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Milan Dojchinovski

  • Dinesh Reddy

  • Tomáš Kliegr

  • Tomáš Vitvar

  • Harald Sack
