LREC 2018 (Main Conference)

The WAW Corpus: The First Corpus of Interpreted Speeches and their Translations for English and Arabic

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/4yztmfdo5eha

Abstract

This article presents the WAW Corpus, an interpreting corpus for English/Arabic that can be used for teaching interpreters, studying the characteristics of interpreters’ work, and training machine translation systems. The corpus contains recordings of lectures and speeches from international conferences, their interpretations, transcripts of both the original speeches and their interpretations, and human translations of both kinds of transcripts into the other language of the pair. The article describes the corpus curation, statistics, and assessment, as well as a case study of the corpus’s use.

Details

Paper ID
lrec2018-main-336
Pages
N/A
BibKey
abdelali-etal-2018-waw
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Ahmed Abdelali

  • Irina Temnikova

  • Samy Hedaya

  • Stephan Vogel
