A Part-of-Speech-Based Search Algorithm for Translation Memories
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
State-of-the-art translation memory systems retrieve related sentences on the basis of orthographic similarity. This often leads to poor search results, since orthographically similar sentences are not necessarily semantically related. In this paper we propose a search algorithm that aims to reduce this problem by taking part-of-speech information into account. The algorithm requires that the parallel sentences stored in the translation memory be processed with standard tools for word alignment and part-of-speech tagging. The work described is part of an ongoing project in example-based machine translation.
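The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each stored sentence already carries a hand-assigned part-of-speech tag sequence, and it blends a character-level (orthographic) similarity with a tag-sequence similarity. The function names, the `weight` parameter, the tag labels, and the example sentences are all hypothetical; word alignment is left out entirely.

```python
from difflib import SequenceMatcher

# A sentence is modelled as (text, pos_tags). The tags are assumed to come
# from an external part-of-speech tagger; here they are written by hand.
Sentence = tuple[str, list[str]]


def orthographic_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pos_similarity(tags_a: list[str], tags_b: list[str]) -> float:
    """Similarity of two part-of-speech tag sequences in [0, 1]."""
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def combined_score(query: Sentence, candidate: Sentence, weight: float = 0.5) -> float:
    """Blend both similarities; `weight` is the share given to the tag sequence.

    The linear blend is an illustrative assumption, not taken from the paper.
    """
    ortho = orthographic_similarity(query[0], candidate[0])
    pos = pos_similarity(query[1], candidate[1])
    return (1.0 - weight) * ortho + weight * pos


def rank(query: Sentence, memory: list[Sentence], weight: float = 0.5) -> list[Sentence]:
    """Return the memory entries ordered from best to worst match."""
    return sorted(memory, key=lambda s: combined_score(query, s, weight), reverse=True)


if __name__ == "__main__":
    memory = [
        ("The printer is offline", ["DET", "NOUN", "VERB", "ADJ"]),
        ("Print the offline report", ["VERB", "DET", "ADJ", "NOUN"]),
    ]
    query = ("The scanner is offline", ["DET", "NOUN", "VERB", "ADJ"])
    for text, _tags in rank(query, memory):
        print(text)
```

Setting `weight` to 0 reduces the score to a purely orthographic match, which makes it easy to compare the two ranking behaviours on the same memory.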