LREC 2016, Main Conference
Crowdsourced Corpus with Entity Salience Annotations
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Abstract
In this paper, we present a crowdsourced dataset that adds entity salience (importance) annotations to the Reuters-128 dataset, which is a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and reuse. We show the potential of the dataset on the task of learning an entity salience classifier and report the results of several experiments.