
Detecting Annotation Scheme Variation in Out-of-Domain Treebanks

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/28bcc7opj8ki

Abstract

To ensure portability of NLP systems across multiple domains, existing treebanks are often extended by adding trees from interesting domains that were not part of the initial annotation effort. In this paper, we will argue that it is both useful from an application viewpoint and enlightening from a linguistic viewpoint to detect and reduce divergence in annotation schemes between extant and new parts in a set of treebanks that is to be used in evaluation experiments. The results of our correction and harmonization efforts will be made available to the public as a test suite for the evaluation of constituent parsing.

Details

Paper ID
lrec2016-main-373
Pages
pp. 2354-2360
BibKey
versley-steen-2016-detecting
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Yannick Versley
  • Julius Steen
