Back to Main Conference 2012
LREC 2012main

Automatic MT Error Analysis: Hjerson Helping Addicter

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/4vvqg5vs7xyw

Abstract

We present a complex, open source tool for detailed machine translation error analysis providing the user with automatic error detection and classification, several monolingual alignment algorithms as well as with training and test corpus browsing. The tool is the result of a merge of automatic error detection and classification of Hjerson (Popović, 2011) and Addicter (Zeman et al., 2011) into the pipeline and web visualization of Addicter. It classifies errors into categories similar to those of Vilar et al. (2006), such as: morphological, reordering, missing words, extra words and lexical errors. The graphical user interface shows alignments in both training corpus and test data; the different classes of errors are colored. Also, the summary of errors can be displayed to provide an overall view of the MT system's weaknesses. The tool was developed in Linux, but it was tested on Windows too.

Details

Paper ID
lrec2012-main-160
Pages
pp. 2158-2163
BibKey
berka-etal-2012-automatic
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21 May 2012 27 May 2012

Authors

  • JB

    Jan Berka

  • OB

    Ondřej Bojar

  • MF

    Mark Fishel

  • MP

    Maja Popović

  • DZ

    Daniel Zeman

Links