Creating a Coreference Resolution System for Italian

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2mzb8my546vf

Abstract

This paper summarizes our work on creating a full-scale coreference resolution (CR) system for Italian using BART, an open-source modular CR toolkit initially developed for English corpora. We discuss our experiments on language-specific issues of the task. As our evaluation experiments show, a language-agnostic system (designed primarily for English) can achieve a performance level in the high forties (MUC F-score) when re-trained and tested on a new language, at least on gold mention boundaries. Compared to this level, we can improve our F-score by around 10% by introducing a small number of language-specific changes. This shows that, with a modular coreference resolution platform such as BART, one can straightforwardly develop a family of robust and reliable systems for various languages. We hope that our experiments will encourage researchers working on coreference in other languages to create their own full-scale coreference resolution systems; at the moment, such systems exist only for very few languages other than English.

Details

Paper ID
lrec2010-main-523
Pages
N/A
BibKey
poesio-etal-2010-creating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17-23 May 2010

Authors

  • Massimo Poesio

  • Olga Uryupina

  • Yannick Versley

Links