Translation Unit Concerning Timing of Simultaneous Translation

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/3q6ciitm2bsd

Abstract

This paper discusses and proposes a translation unit for simultaneous translation using a machine translation system. Monologues, such as lectures or broadcast news, are the target of simultaneous speech translation. To date, much of the research on speech translation has dealt with dialogues, especially travel conversations. Most speech translation systems have treated the sentence as the translation unit. In the ATR travel conversation database, the average sentence length is under 10 words. Therefore, most of the sentences are simple, and almost all of the utterances consist of one or two sentences. However, sentences in monologues are longer than those in travel dialogues. They average over 30 words (as in "ASU-wo-YOMU," a TV news commentary program), and most of them are complex or compound. Accordingly, it is difficult to treat a sentence as a translation unit for monologues, and an appropriate translation unit needs to be found. Considering this, we hypothesized that an adequate translation unit for speech translation systems is related to the translation unit used by human simultaneous translators. Therefore, we collected data on simultaneous translation of lectures by human translators and investigated the characteristics of monologues and simultaneous translation

Details

Paper ID
lrec2002-main-136
Pages
N/A
BibKey
kashioka-2002-translation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Hideki Kashioka
