LREC 2016 main

ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4vzwtadfojx8

Abstract

This paper presents the second release of the ARRAU dataset, a multi-domain corpus with thorough, linguistically motivated annotation of anaphora and related phenomena. Since the first release almost a decade ago, considerable effort has been invested in improving the data both quantitatively and qualitatively. We have doubled the corpus size, expanded the set of covered phenomena to include referentiality and genericity, and designed and implemented a methodology for enforcing the consistency of the manual annotation. We believe that the new release of ARRAU provides valuable material for ongoing research on complex cases of coreference, as well as for a variety of related tasks. The corpus is publicly available through the LDC.

Details

Paper ID
lrec2016-main-326
Pages
pp. 2058-2062
BibKey
uryupina-etal-2016-arrau
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Olga Uryupina
  • Ron Artstein
  • Antonella Bristot
  • Federica Cavicchio
  • Kepa Rodriguez
  • Massimo Poesio

Links