Champollion: A Robust Parallel Text Sentence Aligner
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)
Abstract
This paper describes Champollion, a lexicon-based sentence aligner designed for robust alignment of potentially noisy parallel text. Champollion increases the robustness of the alignment by assigning greater weights to less frequent translated words. Experiments on a manually aligned Chinese-English parallel corpus show that Champollion achieves high precision and recall on noisy data. Champollion can be easily ported to new language pairs. It is freely available to the public.
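The idea of giving greater weight to less frequent translated words can be illustrated with a minimal sketch. This is not Champollion's actual scoring function, which the abstract does not specify. The log inverse-frequency weight, the function names `inverse_frequency_weights` and `similarity`, and the toy English-Spanish lexicon are all assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(sentences):
    """Weight each word by log(total tokens / word count), so rarer words weigh more.

    Assumption: a simple log inverse-frequency weight, chosen for illustration only.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    total = sum(counts.values())
    return {word: math.log(total / count) for word, count in counts.items()}


def similarity(source, target, lexicon, weights):
    """Sum the weights of source words that have a lexicon translation in the target."""
    target_words = set(target)
    score = 0.0
    for word in source:
        translations = lexicon.get(word, set())
        if translations & target_words:
            score += weights.get(word, 0.0)
    return score


# Hypothetical toy data: a tiny source-side corpus and a two-entry lexicon.
corpus = [
    ["the", "treaty", "was", "signed"],
    ["the", "weather", "was", "fine"],
]
lexicon = {"the": {"el"}, "treaty": {"tratado"}}

weights = inverse_frequency_weights(corpus)
score_rare = similarity(["the", "treaty"], ["el", "tratado"], lexicon, weights)
score_common = similarity(["the", "treaty"], ["el", "tiempo"], lexicon, weights)
```

In this sketch, a candidate pair that shares the rare word "treaty" scores higher than one that shares only the frequent word "the", so a match on a common word counts as weaker evidence of a correct alignment.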