
Modified LTSE-VAD Algorithm for Applications Requiring Reduced Silence Frame Misclassification

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/4wztccwg2i48

Abstract

The LTSE-VAD is one of the best-known algorithms for voice activity detection. In this paper we present a modified version of this algorithm that bases the VAD decision not on the estimated background noise level but on the signal-to-noise ratio (SNR). This makes the algorithm robust not only to noise level changes, but also to signal level changes. We compare the modified algorithm with the original one and with three other standard VAD systems. The results show that the modified version achieves the lowest silence misclassification rate while maintaining a reasonably low speech misclassification rate. As a result, this algorithm is more suitable for identification tasks, such as speaker or emotion recognition, where silence misclassification can be very harmful. A series of automatic emotion identification experiments is also carried out, showing that the modified version of the algorithm helps increase the correct emotion classification rate.
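The difference between keying a decision on the absolute noise level and keying it on the SNR can be illustrated with a small numeric sketch. This is not the authors' implementation and does not reproduce the LTSE-VAD decision rule; the helper names `level_db` and `snr_db`, the synthetic frame, and the gain value are all assumptions made for illustration. It only shows that a uniform gain change moves the measured noise level while leaving the SNR essentially unchanged.

```python
import numpy as np

def level_db(x):
    """Mean power of a frame in dB (a small floor avoids log of zero)."""
    return 10.0 * np.log10(np.mean(np.square(x)) + 1e-12)

def snr_db(frame, noise):
    """Frame level relative to a noise estimate, in dB (hypothetical helper)."""
    return level_db(frame) - level_db(noise)

# Synthetic data, assumed for illustration only: a noise-only segment
# and a frame containing a tone plus independent noise.
rng = np.random.default_rng(0)
noise = 0.05 * rng.standard_normal(400)
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(400) / 8000)
frame = tone + 0.05 * rng.standard_normal(400)

# Apply the same gain to the whole recording, e.g. a lower input level.
gain = 0.1
noise_shift = level_db(gain * noise) - level_db(noise)
snr_shift = snr_db(gain * frame, gain * noise) - snr_db(frame, noise)

print(f"noise level shift: {noise_shift:.2f} dB")
print(f"SNR shift: {snr_shift:.2f} dB")
```

A threshold tuned against the absolute noise level would see a different operating point after the gain change, whereas a quantity expressed relative to the noise estimate stays where it was.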

Details

Paper ID
lrec2010-main-514
Pages
N/A
BibKey
luengo-etal-2010-modified
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 - 23 May 2010

Authors

  • Iker Luengo
  • Eva Navas
  • Igor Odriozola
  • Ibon Saratxaga
  • Inmaculada Hernaez
  • Iñaki Sainz
  • Daniel Erro
