Back to Main Conference 2012
LREC 2012main

Towards Fully Automatic Annotation of Audio Books for TTS

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/5kj2guoujtk2

Abstract

Building speech corpora is a first and crucial step for every text-to-speech synthesis system. Nowadays, the use of statistical models implies the use of huge sized corpora that need to be recorded, transcribed, annotated and segmented to be usable. The variety of corpora necessary for recent applications (content, style, etc.) makes the use of existing digital audio resources very attractive. Among all available resources, audiobooks, considering their quality, are interesting. Considering this framework, we propose a complete acquisition, segmentation and annotation chain for audiobooks that tends to be fully automatic. The proposed process relies on a data structure, Roots, that establishes the relations between the different annotation levels represented as sequences of items. This methodology has been applied successfully on 11 hours of speech extracted from an audiobook. A manual check, on a part of the corpus, shows the efficiency of the process.

Details

Paper ID
lrec2012-main-363
Pages
pp. 975-980
BibKey
boeffard-etal-2012-towards
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21 May 2012 27 May 2012

Authors

  • OB

    Olivier Boeffard

  • LC

    Laure Charonnat

  • SM

    Sébastien Le Maguer

  • DL

    Damien Lolive

Links