SubCo: A Learner Translation Corpus of Human and Machine Subtitles

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/3tseo4jygqfs

Abstract

In this paper, we present a freely available corpus of human and automatic translations of subtitles. The corpus comprises the original English subtitles (SRC), both human (HT) and machine translations (MT) into German, as well as post-editions (PE) of the MT output. HT and MT are annotated with errors. Moreover, human evaluation is included for HT, MT, and PE. Such a corpus is a valuable resource for both the human and machine translation communities, enabling direct comparison -- in terms of errors and evaluation -- between human translations, machine translations, and post-edited machine translations.

Details

Paper ID
lrec2016-main-357
Pages
pp. 2246-2254
BibKey
martinez-vela-2016-subco
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • José Manuel Martínez Martínez

    Universität des Saarlandes

  • Mihaela Vela

    Universität des Saarlandes

Links