
Towards the Use of Word Stems and Suffixes for Statistical Machine Translation

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/3iite3qjxdu3

Abstract

In this paper we present methods for improving the quality of translation from an inflected language into English by making use of part-of-speech tags as well as word stems and suffixes in the source language. Results for translation from Spanish and Catalan into English are presented on the LC-STAR trilingual corpus, which consists of spontaneously spoken dialogues in the domain of travelling and appointment scheduling. Results for translation from Serbian into English are presented on the Assimil language course, a bilingual corpus from an unrestricted domain. We achieve up to a 5% relative reduction in error rates for Spanish and Catalan and about 8% for Serbian.

Details

Paper ID
lrec2004-main-211
Pages
N/A
BibKey
popovic-ney-2004-towards
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 to 28 May 2004

Authors

  • Maja Popović

  • Hermann Ney
