Statistical Machine Translation on Paraphrased Corpora
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
This paper presents a statistical machine translation system trained on normalized corpora. Automatic paraphrasing is carried out by inducing paraphrasing expressions from a bilingual corpus. Normalization is then treated as a specific paraphrase of a given input, determined by its frequency in a corpus. Experimental results on Japanese-to-English translation with a normalized English corpus showed an 8% reduction in word error rate and an improvement in subjective evaluation from 70% to 72.5%.
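The idea of normalization as a frequency-selected paraphrase can be illustrated with a minimal sketch. It assumes the paraphrase sets have already been induced (the abstract does not describe that step in detail), and the function name `build_normalizer`, the toy corpus, and the tie-breaking rule are hypothetical choices made for illustration only, not the authors' implementation.

```python
from collections import Counter


def build_normalizer(corpus, paraphrase_sets):
    """Map each expression to the most frequent member of its paraphrase set.

    corpus          -- list of expressions observed in the corpus (toy assumption)
    paraphrase_sets -- iterable of sets of mutually paraphrasable expressions,
                       assumed to be given (induction is not shown here)
    """
    counts = Counter(corpus)
    normalizer = {}
    for group in paraphrase_sets:
        # Pick the highest-frequency expression; sorting first makes ties
        # resolve alphabetically (an assumption of this sketch).
        canonical = min(sorted(group), key=lambda expr: -counts[expr])
        for expr in group:
            normalizer[expr] = canonical
    return normalizer


# Toy data, invented for illustration.
corpus = ["could you", "could you", "would you", "can you",
          "could you", "thank you", "thanks"]
paraphrase_sets = [{"could you", "would you", "can you"},
                   {"thank you", "thanks"}]

normalizer = build_normalizer(corpus, paraphrase_sets)
print(normalizer["would you"])  # the most frequent variant in its set
```

Replacing every expression with its canonical form reduces the variety of surface expressions in the training corpus, which is the effect the abstract attributes to normalization.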