Back to Main Conference 2010
LREC 2010main

Training Parsers on Partial Trees: A Cross-language Comparison

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2a3cq75xfamn

Abstract

We present a study that compares data-driven dependency parsers obtained by means of annotation projection between language pairs of varying structural similarity. We show how the partial dependency trees projected from English to Dutch, Italian and German can be exploited to train parsers for the target languages. We evaluate the parsers against manual gold standard annotations and find that the projected parsers substantially outperform our heuristic baselines by 9―25% UAS, which corresponds to a 21―43% reduction in error rate. A comparative error analysis focuses on how the projected target language parsers handle subjects, which is especially interesting for Italian as an instance of a pro-drop language. For Dutch, we further present experiments with German as an alternative source language. In both source languages, we contrast standard baseline parsers with parsers that are enhanced with the predictions from large-scale LFG grammars through a technique of parser stacking, and show that improvements of the source language parser can directly lead to similar improvements of the projected target language parser.

Details

Paper ID
lrec2010-main-500
Pages
N/A
BibKey
spreyer-etal-2010-training
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • KS

    Kathrin Spreyer

  • Lilja Øvrelid

  • JK

    Jonas Kuhn

Links