
An Improved Algorithm for the Automatic Segmentation of Speech Corpora

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/2ud5367m2vke

Abstract

In this paper we describe an improved algorithm for the automatic segmentation of speech corpora. Apart from their usefulness in several speech technology domains, segmentations provide easy access to speech corpora by using time stamps to couple the orthographic transcription to the speech signal. The segmentation tool we propose is based on the Forward-Backward algorithm. The Forward-Backward method not only produces more accurate segmentation results than the traditionally used Viterbi method, but also provides a confidence interval for each of the generated boundaries. These confidence intervals allow us to perform some advanced post-processing operations, leading to further improvement of the quality of automatic segmentations.
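
To illustrate the idea behind the abstract, here is a minimal sketch of how a Forward-Backward pass can yield a posterior distribution over each boundary position, from which an interval can be read off. It is not the authors' implementation. The input `log_b` (per-frame log-likelihoods for each segment), the function names, the omission of transition probabilities, and the choice of a central 95% posterior mass as the "confidence interval" are all assumptions made for illustration. The sketch also assumes at least as many frames as segments.

```python
import math

NEG_INF = float("-inf")


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def boundary_posteriors(log_b):
    """Posterior over boundary positions in a left-to-right segment chain.

    log_b[t][s] is the (hypothetical) log-likelihood of frame t under segment s.
    Returns post[k][t] = P(segment k ends at frame t and segment k+1 starts at t+1).
    Transition probabilities are left out (assumed uniform) to keep the sketch short.
    """
    T, S = len(log_b), len(log_b[0])
    alpha = [[NEG_INF] * S for _ in range(T)]
    beta = [[NEG_INF] * S for _ in range(T)]

    # Forward pass: alpha[t][s] sums over all partial paths ending in segment s at frame t.
    alpha[0][0] = log_b[0][0]
    for t in range(1, T):
        for s in range(S):
            stay = alpha[t - 1][s]
            move = alpha[t - 1][s - 1] if s > 0 else NEG_INF
            alpha[t][s] = log_b[t][s] + logaddexp(stay, move)

    # Backward pass: beta[t][s] sums over all completions after frame t.
    beta[T - 1][S - 1] = 0.0
    for t in range(T - 2, -1, -1):
        for s in range(S):
            stay = beta[t + 1][s] + log_b[t + 1][s]
            move = beta[t + 1][s + 1] + log_b[t + 1][s + 1] if s + 1 < S else NEG_INF
            beta[t][s] = logaddexp(stay, move)

    log_z = alpha[T - 1][S - 1]  # total log-likelihood of all valid segmentations
    post = []
    for k in range(S - 1):
        row = []
        for t in range(T - 1):
            lp = alpha[t][k] + log_b[t + 1][k + 1] + beta[t + 1][k + 1] - log_z
            row.append(math.exp(lp))
        post.append(row)
    return post


def boundary_interval(row, mass=0.95):
    """Expected boundary frame plus a central interval holding `mass` of the posterior."""
    mean = sum(t * p for t, p in enumerate(row))
    lo_q = (1.0 - mass) / 2.0
    hi_q = 1.0 - lo_q
    cum, lo, hi = 0.0, None, None
    for t, p in enumerate(row):
        cum += p
        if lo is None and cum >= lo_q:
            lo = t
        if hi is None and cum >= hi_q:
            hi = t
    if lo is None:
        lo = len(row) - 1
    if hi is None:
        hi = len(row) - 1  # guard against rounding in the cumulative sum
    return mean, lo, hi


if __name__ == "__main__":
    # Toy input: 6 frames, 2 segments; the first 3 frames favour segment 0.
    toy = [[0.0, -5.0]] * 3 + [[-5.0, 0.0]] * 3
    for k, row in enumerate(boundary_posteriors(toy)):
        print(k, boundary_interval(row))
```

A Viterbi alignment would keep only the single best path and so return one frame index per boundary. Summing over all paths, as above, is what makes a spread around each boundary available. A wide interval could then be used to flag a boundary for post-processing, although the specific post-processing operations are not described in the abstract.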

Details

Paper ID
lrec2002-main-010
Pages
N/A
BibKey
laureys-etal-2002-improved
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Tom Laureys

  • Kris Demuynck

  • Jacques Duchateau

  • Patrick Wambacq

Links