LREC 2016, main conference

Crowdsourced Corpus with Entity Salience Annotations

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/3s7wuv3vcrhg

Abstract

In this paper, we present a crowdsourced dataset that adds entity salience (importance) annotations to the Reuters-128 dataset, which is a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and re-use. We show the potential of the dataset on the task of learning an entity salience classifier and report on the results from several experiments.

Details

Paper ID: lrec2016-main-527
Pages: pp. 3307-3311
BibKey: dojchinovski-etal-2016-crowdsourced
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: 978-2-9517408-9-1
Conference: Tenth International Conference on Language Resources and Evaluation
Location: Portorož, Slovenia
Date: 23 - 28 May 2016

Authors

  • Milan Dojchinovski
  • Dinesh Reddy
  • Tomáš Kliegr
  • Tomáš Vitvar
  • Harald Sack

Links