Designing speech database with prosodic variety for expressive TTS system

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/24fsfni4sch3

Abstract

To build a speech synthesis system that can generate high-quality speech with a wide prosodic range and realize fine prosody control, we propose a new method for constructing speech databases. As the synthesis method, we adopt a hybrid system consisting of two parts: speech unit selection and prosody modification by STRAIGHT (a vocoder-type, high-quality analysis-synthesis method). Our guiding principle in designing the database is to reduce the amount of prosody modification, which causes quality deterioration. Hence, to make it possible to generate arbitrary prosody within the permissible range of prosody modification, we designed 9 sub-databases that consist of the same phonetically balanced text set with different prosody. In this paper, we report the design method and the general features of the resulting databases. Listening tests focused on durational features were also conducted. The results show the effectiveness of the method and the necessity of changing the unit selection cost according to speech rate.

Details

Paper ID: lrec2002-main-337
Pages: N/A
BibKey: kawanami-etal-2002-designing
Editor: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: N/A
Conference: Third International Conference on Language Resources and Evaluation
Location: Las Palmas, Spain
Date: 29 May 2002 to 31 May 2002

Authors

  • Hiromichi Kawanami
  • Tsuyoshi Masuda
  • Tomoki Toda
  • Kiyohiro Shikano
