Back to Main Conference 2016
LREC 2016main

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/57mr6snwv339

Abstract

This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce,by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specific challenges concerning the automatic translation of user-generated content are discussed in detail. Secondly, the paper highlights the risk of data sparsity inherent to e-commerce and it explores the state-of-the-art strategies to achieve domain adequacy via adaptation. Thirdly, it proposes a robust workflow for the development of SMT systems adapted to the e-commerce domain by relying on inexpensive methods. Given the scarcity of user-generated language corpora for most language pairs, the paper proposes to obtain monolingual target-language data to train language models and aligned parallel corpora to tune and evaluate MT systems by means of crowdsourcing.

Details

Paper ID
lrec2016-main-721
Pages
pp. 4550-4556
BibKey
barrera-etal-2016-enhancing
Editors
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 - 28 May 2016

Authors

  • MB

    Meritxell Fernández Barrera

  • VP

    Vladimir Popescu

  • AT

    Antonio Toral

  • FG

    Federico Gaspari

  • KC

    Khalid Choukri

Links