Could Speaker, Gender or Age Awareness be beneficial in Speech-based Emotion Recognition?

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4kyenw4nkyas

Abstract

Emotion Recognition (ER) is an important part of dialogue analysis that can be used to improve the quality of Spoken Dialogue Systems (SDSs). The dialogue manager component can use the emotional hypothesis for an end-user's current response to change the SDS strategy, which could enhance quality. In this study, additional speaker-related information is used to improve the performance of speech-based ER. The information analysed is the speaker's identity, gender and age. Two schemes are described: using the additional information as an independent variable within the feature vector, and creating a separate emotional model for each speaker, gender or age cluster. The proposed approaches were compared against a baseline ER system that uses no additional information, on a number of emotional speech corpora in German, English, Japanese and Russian. For some of the corpora, the proposed approaches significantly outperform the baseline, with a relative difference of up to 11.9%.
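
The two schemes named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid classifier is a stand-in (the abstract does not name the classifiers or acoustic features used in the study), the one-dimensional toy data is invented, gender is encoded as 0/1, and all function names are hypothetical.

```python
from collections import defaultdict


def train_centroids(features, labels):
    """Stand-in classifier (assumption): one mean feature vector per emotion."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the emotion whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


# Scheme 1: the speaker-related attribute becomes an extra feature.
def augment(x, group_code):
    return list(x) + [float(group_code)]


def train_augmented(features, groups, labels):
    augmented = [augment(x, g) for x, g in zip(features, groups)]
    return train_centroids(augmented, labels)


def predict_augmented(model, x, group):
    return predict(model, augment(x, group))


# Scheme 2: one separate emotion model per speaker, gender or age cluster.
def train_per_group(features, groups, labels):
    by_group = defaultdict(lambda: ([], []))
    for x, g, y in zip(features, groups, labels):
        by_group[g][0].append(x)
        by_group[g][1].append(y)
    return {g: train_centroids(xs, ys) for g, (xs, ys) in by_group.items()}


def predict_per_group(models, x, group):
    return predict(models[group], x)


# Invented toy data: a single acoustic feature, two groups (0 and 1).
features = [[1.0], [2.0], [2.0], [3.0]]
groups = [0, 0, 1, 1]
labels = ["neutral", "angry", "neutral", "angry"]

augmented_model = train_augmented(features, groups, labels)
group_models = train_per_group(features, groups, labels)
```

In this toy setup the same feature value can map to different emotions depending on the group, which is the situation the per-group scheme is meant to handle. The sketch only shows the mechanics of each scheme and says nothing about their relative accuracy on real corpora.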

Details

Paper ID
lrec2016-main-010
Pages
pp. 61-68
BibKey
sidorov-etal-2016-speaker
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 to 28 May 2016

Authors

  • Maxim Sidorov

  • Alexander Schmitt

  • Eugene Semenkin

  • Wolfgang Minker
