
PARC 3.0: A Corpus of Attribution Relations

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/277rsu645dra

Abstract

Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution and generated small or incomplete resources, thus limiting the applicability of machine learning approaches. This paper presents PARC 3.0, a large corpus fully annotated with Attribution Relations (ARs). The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to their constitutive elements: source, cue and content. The corpus, which comprises around 20k ARs, was used to investigate the range of structures that can express attribution. The results show a complex and varied relation of which the literature has addressed only a portion. PARC 3.0 is available for research use and can be used in a range of different studies to analyse attribution and validate assumptions as well as to develop supervised attribution extraction models.

Details

Paper ID
lrec2016-main-619
Pages
pp. 3914-3920
BibKey
pareti-2016-parc
Editors
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 - 28 May 2016

Authors

  • Silvia Pareti

Links