Back to Main Conference 2012
LREC 2012main

Assessing Divergence Measures for Automated Document Routing in an Adaptive MT System

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/4j9uzhosq3nj

Abstract

Custom machine translation (MT) engines systematically outperform general-domain MT engines when translating within the relevant custom domain. This paper investigates the use of the Jensen-Shannon divergence measure for automatically routing new documents within a translation system with multiple MT engines to the appropriate custom MT engine in order to obtain the best translation. Three distinct domains are compared, and the impact of the language, size, and preprocessing of the documents on the Jensen-Shannon score is addressed. Six test datasets are then compared to the three known-domain corpora to predict which of the three custom MT engines they would be routed to at runtime given their Jensen-Shannon scores. The results are promising for incorporating this divergence measure into a translation workflow.

Details

Paper ID
lrec2012-main-502
Pages
pp. 3963-3970
BibKey
jaja-etal-2012-assessing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21 May 2012 27 May 2012

Authors

  • CJ

    Claire Jaja

  • DB

    Douglas Briesch

  • JL

    Jamal Laoudi

  • CV

    Clare Voss

Links