A Coreference Corpus and Resolution System for Dutch

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/25uhxow4ott7

Abstract

We present the main outcomes of the COREA project: a corpus annotated with coreferential relations and a coreference resolution system for Dutch. In the project we developed annotation guidelines for coreference resolution for Dutch and annotated a corpus of 135K tokens. We discuss these guidelines, the annotation tool, and the inter-annotator agreement. We also show a visualization of the annotated relations. The standard approach to evaluate a coreference resolution system is to compare the predictions of the system to a hand-annotated gold standard test set (cross-validation). A more practically oriented evaluation is to test the usefulness of coreference relation information in an NLP application. We run experiments with an Information Extraction module for the medical domain, and measure the performance of this module with and without the coreference relation information. We present the results of both this application-oriented evaluation of our system and of a standard cross-validation evaluation. In a separate experiment we also evaluate the effect of coreference information produced by a simple rule-based coreference module in a Question Answering application.

Details

Paper ID: lrec2008-main-094
Pages: N/A
BibKey: hendrickx-etal-2008-coreference
Editor: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: 2-9517408-4-0
Conference: Sixth International Conference on Language Resources and Evaluation
Location: Marrakech, Morocco
Date: 28-30 May 2008

Authors

  • Iris Hendrickx
  • Gosse Bouma
  • Frederik Coppens
  • Walter Daelemans
  • Veronique Hoste
  • Geert Kloosterman
  • Anne-Marie Mineur
  • Joeri Van Der Vloet
  • Jean-Luc Verschelde
