Back to Main Conference 2016
LREC 2016main

PARC 3.0: A Corpus of Attribution Relations

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/277rsu645dra

Abstract

Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution and generated small or incomplete resources, thus limiting the applicability of machine learning approaches. This paper presents PARC 3.0, a large corpus fully annotated with Attribution Relations (ARs). The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to its constitutive elements: source, cue and content. The corpus, which comprises around 20k ARs, was used to investigate the range of structures that can express attribution. The results show a complex and varied relation of which the literature has addressed only a portion. PARC 3.0 is available for research use and can be used in a range of different studies to analyse attribution and validate assumptions as well as to develop supervised attribution extraction models.

Details

Paper ID
lrec2016-main-619
Pages
pp. 3914-3920
BibKey
pareti-2016-parc
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 28 May 2016

Authors

  • SP

    Silvia Pareti

Links