LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System
Authors Rojc Matej (Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000 Maribor, matej.rojc@uni-mb.si)
Kačič Zdravko (Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000 Maribor, kacic@uni-mb.si)
Keywords Grapheme-to-Phoneme Conversion, Non-Uniform Units, Text Processing
Session Session SP1 - Phonetic Issues and Speech Synthesis
Full Paper 177.ps, 177.pdf
Abstract This paper presents the development of a Slovenian speech corpus for use in the concatenative speech synthesis system being developed at the University of Maribor, Slovenia. The emphasis is on maximising the usefulness of the defined speech corpus for concatenation purposes. The usefulness of a speech corpus depends heavily on the corresponding text and can be increased by choosing appropriate text. In our approach, detailed statistics of the text corpora were computed in order to identify sentences rich in non-uniform units such as monophones, diphones and triphones.
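
The abstract does not spell out the selection procedure, so the following is only a minimal Python sketch of one common way to pick unit-rich sentences. It assumes each sentence has already been converted to a phoneme sequence, and it uses a greedy coverage criterion, which is an assumption and not necessarily the authors' method. The names `extract_units`, `unit_statistics` and `greedy_select`, and the toy phoneme symbols, are hypothetical.

```python
from collections import Counter


def extract_units(phones, max_n=3):
    """Return the set of n-phone units (monophones, diphones, triphones
    for max_n=3) that occur in one phoneme sequence."""
    units = set()
    for n in range(1, max_n + 1):
        for i in range(len(phones) - n + 1):
            units.add(tuple(phones[i:i + n]))
    return units


def unit_statistics(sentences, max_n=3):
    """Count every occurrence of each n-phone unit over the whole text corpus."""
    counts = Counter()
    for phones in sentences:
        for n in range(1, max_n + 1):
            for i in range(len(phones) - n + 1):
                counts[tuple(phones[i:i + n])] += 1
    return counts


def greedy_select(sentences, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered units.

    Assumed criterion: at each step take the sentence with the largest
    number of new units (ties go to the lowest index) and stop when no
    sentence adds anything new or the sentence budget is used up.
    """
    covered = set()
    chosen = []
    remaining = {i: extract_units(p) for i, p in enumerate(sentences)}
    while remaining and len(chosen) < max_sentences:
        best = max(remaining, key=lambda i: (len(remaining[i] - covered), -i))
        gain = remaining[best] - covered
        if not gain:
            break
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Toy phoneme sequences; placeholders, not real Slovenian transcriptions.
toy_corpus = [["a", "b"], ["a", "b", "c"], ["d"]]
selected, covered_units = greedy_select(toy_corpus, max_sentences=2)
```

In such a sketch, `unit_statistics` would supply the corpus-wide frequencies, which could be used to weight units by frequency or rarity instead of counting every new unit equally.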