
MTLens: Machine Translation Output Debugging

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/4fyij242n3wn

Abstract

The performance of Machine Translation (MT) systems varies significantly across inputs with differing features such as topic, genre, and surface properties. Though many MT evaluation metrics generally correlate with human judgments, they are not directly useful for identifying the specific shortcomings of MT systems. In this demo, we present a benchmarking interface that enables improved evaluation of individual MT systems in isolation, or of multiple MT systems collectively, by quantitatively assessing their performance on many tasks across multiple domains and evaluation metrics. Further, it facilitates effective debugging and error analysis of MT output via dynamic filters that help users home in on problem sentences with specific properties, such as genre, topic, and sentence length. The interface can be extended to include additional filters such as lexical, morphological, and syntactic features. Aside from helping debug MT output, it can also help identify problems in reference translations and evaluation metrics.

Details

Paper ID
lrec2022-main-448
Pages
pp. 4221-4226
BibKey
sharma-etal-2022-mtlens
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20–25 June 2022

Authors

  • Shreyas Sharma
  • Kareem Darwish
  • Lucas Pavanelli
  • Thiago Castro Ferreira
  • Mohamed Al-Badrashiny
  • Kamer Ali Yuksel
  • Hassan Sawaf
