Back to Main Conference 2014
LREC 2014main

On Complex Word Alignment Configurations

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/4g4eird46cwe

Abstract

Resources of manual word alignments contain configurations that are beyond the alignment capacity of current translation models, hence the term complex alignment configuration. They have been the matter of some debate in the machine translation community, as they call for more powerful translation models that come with further complications. In this work we investigate instances of complex alignment configurations in data sets of four different language pairs to shed more light on the nature and cause of those configurations. For the English-German alignments from Padó and Lapata (2006), for instance, we find that only a small fraction of the complex configurations are due to real annotation errors. While a third of the complex configurations in this data set could be simplified when annotating according to a different style guide, the remaining ones are phenomena that one would like to be able to generate during translation. Those instances are mainly caused by the different word order of English and German. Our findings thus motivate further research in the area of translation beyond phrase-based and context-free translation modeling.

Details

Paper ID
lrec2014-main-338
Pages
pp. 1773-1780
BibKey
kaeshammer-westburg-2014-complex
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 31 May 2014

Authors

  • MK

    Miriam Kaeshammer

  • AW

    Anika Westburg

Links