Statistical Machine Translation on Paraphrased Corpora

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/4xuyhdcugqpe

Abstract

This paper presents a statistical machine translation system trained on normalized corpora. Automatic paraphrasing is carried out by inducing paraphrasing expressions from a bilingual corpus. Normalization is then treated as a specific paraphrase of a given input, determined by its frequency in a corpus. Experimental results on Japanese-to-English translation with a normalized English corpus showed a reduction in word error rate of 8% and an improvement in subjective evaluation from 70% to 72.5%.
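The abstract's idea of normalization as a frequency-selected paraphrase can be illustrated with a minimal sketch. It assumes paraphrase sets are already available as groups of equivalent sentences; the function name `normalize`, the toy corpus, and the alphabetical tie-breaking are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import Counter


def normalize(sentence, paraphrase_sets, corpus_counts):
    """Return the most frequent member of the sentence's paraphrase set.

    Sentences that belong to no paraphrase set are returned unchanged.
    """
    for group in paraphrase_sets:
        if sentence in group:
            # Sorting first makes ties resolve alphabetically (an assumption
            # made here only so the result is deterministic).
            return max(sorted(group), key=lambda s: corpus_counts[s])
    return sentence


# Toy English corpus (hypothetical data, for illustration only).
corpus = [
    "could you show me the menu",
    "could you show me the menu",
    "may i see the menu",
    "please show me the menu",
]
corpus_counts = Counter(corpus)

# One hypothetical paraphrase set, assumed to have been induced beforehand.
paraphrase_sets = [
    {
        "could you show me the menu",
        "may i see the menu",
        "please show me the menu",
    }
]

normalized = [normalize(s, paraphrase_sets, corpus_counts) for s in corpus]
```

In this toy case every sentence in the paraphrase set collapses to its most frequent variant, which reduces the variety of expressions a translation model would have to learn.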

Details

Paper ID
lrec2002-main-134
Pages
N/A
BibKey
watanabe-etal-2002-statistical
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Taro Watanabe

  • Mitsuo Shimohata

  • Eiichiro Sumita
