
Discontinuous Verb Phrases in Parsing and Machine Translation of English and German

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/2pxp7y7zbbea

Abstract

In this paper, we focus on the verb-particle (V-Prt) split construction in English and German and the difficulty it poses for parsing and machine translation (MT). For German, we use an existing test suite of V-Prt split constructions, while for English, we build a new, comparable test suite from raw data. These two data sets are then used to analyze the errors in dependency parsing, word-level alignment and MT that arise from the discontinuous order in V-Prt split constructions. In the automatic alignments of parallel corpora, most of the particles align to NULL. These misalignments, together with the inability of phrase-based MT systems to recover discontinuous phrases, result in low-quality translations of V-Prt split constructions in both English and German. However, our results show that the V-Prt split phrases are correctly parsed in 90% of cases, suggesting that syntax-based MT should perform better on these constructions. We evaluate a syntax-based MT system on German and compare its performance to that of the phrase-based system.

Details

Paper ID
lrec2016-main-453
Pages
pp. 2839-2845
BibKey
loaiciga-gulordava-2016-discontinuous
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Sharid Loáiciga

  • Kristina Gulordava
