Resources for Multilingual Text Generation in Three Slavic Languages
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000)
Abstract
The paper discusses the methods followed to re-use a large-scale, broad-coverage English grammar for constructing similar scale grammars for Bulgarian, Czech and Russian for the fast prototyping of a multilingual generation system. We present (1) the theoretical and methodological basis for resource sharing across languages, (2) the use of a corpus-based contrastive register analysis, in particular, contrastive analysis of mood and agency. Because the study concerns reuse of the grammar of a language that is typologically quite different from the languages treated, the issues addressed in this paper appear relevant to a wider range of researchers in need of large-scale grammars for less-researched languages.