LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title Corpora of Slovene Spoken Language for Multi-lingual Applications
Authors Gros Jerneja (Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1001 Ljubljana, Slovenia, nejka@fe.uni-lj.si)
Mihelič France (Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1001 Ljubljana, Slovenia, mihelicf@fe.uni-lj.si)
Dobrišek Simon (Faculty of Electrical Engineering, University of Ljubljana, Laboratory of Artificial Perception, Tržaška 25, 1000 Ljubljana, Slovenia, simond@fe.uni-lj.si)
Erjavec Tomaž (Dept. for Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia, tomaz.erjavecg@ijs.si)
Žganec Mario (Masterpoint R&D, Baznikova 40, 1000 Ljubljana, Slovenia, Mario@masterpoint.si)
Keywords Annotation Tools, Continuous Speech, Diphone Inventory, Speech Corpus, Spoken Commands
Session Session SP3 - Spoken Language Resources' Projects
Full Paper 288.ps, 288.pdf
Abstract The domain of spoken language technologies ranges from speech input and output systems to complex understanding and generation systems, including multi-modal systems of widely differing complexity (such as automatic dictation machines) and multilingual systems (for example automatic dialogue and translation systems). The definition of standards and evaluation methodologies for such systems involves the specification and development of highly specific spoken language corpus and lexicon resources, and measurement and evaluation tools (EAGLES Handbook 1997). This paper presents the MobiLuz spoken resources of the Slovene language, which will be made freely available for research purposes in speech technology and linguistics.