Back to Main Conference 2010
LREC 2010main

Evaluating Machine Translation Utility via Semantic Role Labels

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2yzeaktk7a4u

Abstract

We present the methodology that underlies mew metrics for semantic machine translation evaluation we are developing. Unlike widely-used lexical and n-gram based MT evaluation metrics, the aim of semantic MT evaluation is to measure the utility of translations. We discuss the design of empirical studies to evaluate the utility of machine translation output by assessing the accuracy for key semantic roles. These roles are from the English 5W templates (who, what, when, where, why) used in recent GALE distillation evaluations. Recent work by Wu and Fung (2009) introduced semantic role labeling into statistical machine translation to enhance the quality of MT output. However, this approach has so far only been evaluated using lexical and n-gram based SMT evaluation metrics like BLEU which are not aimed at evaluating the utility of MT output. Direct data analysis are still needed to understand how semantic models can be leveraged to evaluate the utility of MT output. In this paper, we discuss a new methodology for evaluating the utility of the machine translation output, by assessing the accuracy with which human readers are able to complete the English 5W templates.

Details

Paper ID
lrec2010-main-521
Pages
N/A
BibKey
lo-wu-2010-evaluating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • CL

    Chi-kiu Lo

  • DW

    Dekai Wu

Links