Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)
Abstract
This article describes in detail the procedures used to build Slovenian lexica within the LC-STAR project and reports the size of those lexica. The University of Maribor joined the LC-STAR project in order to provide appropriate language resources for developing speech-to-speech translation technology for the Slovenian language. The lexica consist of three parts: 65,000 common words, 45,000 proper names, and 6,000 special application domain words. All lexica will be morpho-syntactically tagged and phonetically transcribed. The quality of the produced language resources is ensured by independent validation.