LREC 2022 workshop

CUNI Submission to MT4All Shared Task

Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

DOI:10.63317/59gvecb85e2x

Abstract

This paper describes our submission to the MT4All Shared Task in unsupervised machine translation from English to Ukrainian, Kazakh and Georgian in the legal domain. In addition to the standard pipeline for unsupervised training (pretraining followed by denoising and back-translation), we used supervised training on a pseudo-parallel corpus retrieved from the provided monolingual corpora. Our system scored significantly higher than the baseline hybrid unsupervised MT system.

Details

Paper ID
lrec2022-ws-sigul-10
Pages
pp. 78-82
BibKey
kvapilikova-bojar-2022-cuni
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Date
20-25 June 2022

Authors

  • Ivana Kvapilíková

  • Ondrej Bojar
