
On the use of a fuzzy classifier to speed up the Sp_ToBI labeling of the Glissando Spanish corpus

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/3ib8eox9meno

Abstract

In this paper, we present the application of a novel automatic prosodic labeling methodology for speeding up the manual labeling of the Glissando corpus (Spanish read news items). The methodology is based on the use of soft classification techniques. The output of the automatic system consists of a set of label candidates per word. The number of predicted candidates depends on the degree of certainty assigned by the classifier to each of the predictions. The manual transcriber checks the sets of predictions to select the correct one. We describe the fundamentals of the fuzzy classification tool and its training with a corpus labeled with Sp_ToBI labels. Results show a clear coherence between the most confused labels in the output of the automatic classifier and the most confused labels detected in inter-transcriber consistency tests. More importantly, in a preliminary test, the real-time ratio of the labeling process was 1:66 when the template of predictions was used and 1:80 when it was not.
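
The idea that the number of candidates per word depends on classifier certainty can be illustrated with a minimal sketch. This is not the authors' tool: the function name `candidate_labels`, the `coverage` threshold, the selection rule (keep top-ranked labels until their normalized membership mass reaches the threshold), and the membership values are all assumptions made for illustration.

```python
from typing import Dict, List


def candidate_labels(memberships: Dict[str, float], coverage: float = 0.8) -> List[str]:
    """Return the smallest set of top-ranked labels whose normalized
    membership mass reaches `coverage`.

    Both the rule and the default threshold are hypothetical; the
    abstract does not state how the classifier sizes its candidate sets.
    """
    total = sum(memberships.values())
    if total <= 0:
        raise ValueError("memberships must contain positive mass")

    # Rank labels from most to least certain.
    ranked = sorted(memberships.items(), key=lambda kv: kv[1], reverse=True)

    chosen: List[str] = []
    mass = 0.0
    for label, score in ranked:
        chosen.append(label)
        mass += score / total
        if mass >= coverage:
            break
    return chosen


# Illustrative (made-up) membership degrees for two words.
confident = candidate_labels({"L+H*": 0.90, "L*+H": 0.06, "H*": 0.04})
uncertain = candidate_labels({"L+H*": 0.45, "L*+H": 0.40, "H*": 0.15})
print(confident)
print(uncertain)
```

Under this assumed rule, a word the classifier is sure about yields a single candidate, while an ambiguous word yields a short list from which the transcriber picks the correct label.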

Details

Paper ID
lrec2014-main-049
Pages
pp. 1962-1969
BibKey
escudero-etal-2014-use
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26-31 May 2014

Authors

  • David Escudero
  • Lourdes Aguilar-Cuevas
  • César González-Ferreras
  • Yurena Gutiérrez-González
  • Valentín Cardeñoso-Payo

Links