Achieving Domain Specificity in SMT without Overt Siloing
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)
Abstract
We examine data pooling as a method for improving Statistical Machine Translation (SMT) quality in narrowly defined domains, such as the data of a particular company or public entity. By pooling all available data, building large SMT engines, and using domain-specific target language models, we see gains in quality and achieve the generalizability and resilience of a larger SMT engine together with the precision of a domain-specific one.