Using a Cross-Language Information Retrieval System based on OHSUMED to Evaluate the Moses and KantanMT Statistical Machine Translation Systems

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Abstract

The objective of this paper was to evaluate the performance of two statistical machine translation (SMT) systems within a cross-language information retrieval (CLIR) architecture and to examine whether translation quality correlates with CLIR performance. The SMT systems were KantanMT, a cloud-based machine translation (MT) platform, and Moses, an open-source MT application. First, we trained both systems using the same language resources: the EMEA corpus for the translation model and language model, and the QTLP corpus for tuning. Then we translated the 63 queries of the OHSUMED test collection from Greek into English using both MT systems. Next, we ran the translated queries on the document collection using Apache Solr to obtain the top ten matches for each query. The results were compared to the OHSUMED gold standard. KantanMT achieved higher average precision and F-measure than Moses, while both systems produced the same recall score. We also calculated the BLEU score for each system using the ECDC corpus; Moses achieved a higher BLEU score than KantanMT. Finally, we tested the IR performance of the original English queries. Overall, this work showed that CLIR performance can be better even when the BLEU score is worse.
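The comparison against the gold standard described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas used, so this sketch assumes set-based precision, recall, and F-measure computed over the top ten retrieved documents and then averaged over queries. The function names (`precision_recall_f1`, `macro_average`) and the document identifiers are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall_f1(
    retrieved: Sequence[str], relevant: Set[str], k: int = 10
) -> Tuple[float, float, float]:
    """Score one query: top-k retrieved document IDs vs. gold-standard relevant IDs.

    Assumes the retrieved IDs are unique, as in a ranked result list.
    """
    top_k = list(retrieved[:k])
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Precision: share of the returned documents that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of the relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    # F-measure: harmonic mean of precision and recall (0 when both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(scores: List[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Average per-query (precision, recall, F-measure) triples over all queries."""
    if not scores:
        return 0.0, 0.0, 0.0
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )
```

Under these assumptions, each translated query would be scored with `precision_recall_f1(ranked_ids, gold_ids)` and the per-query triples passed to `macro_average` once per MT system, so that the two systems can be compared on the same footing.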

Details

Paper ID

lrec2016-main-057

Pages

pp. 368-372

DOI

10.63317/25re6rbweqdv

BibKey

katris-etal-2016-using

Editors

Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

978-2-9517408-9-1

Conference

Tenth International Conference on Language Resources and Evaluation

Location

Portorož, Slovenia

Date

23 - 28 May 2016

Authors

Nikolaos Katris
Richard Sutcliffe
Theodore Kalamboukis
