The SpeDial datasets: datasets for Spoken Dialogue Systems analytics

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Abstract

The SpeDial consortium is sharing two datasets that were used during the SpeDial project. By sharing them with the community, we provide a resource that can shorten the development cycle of new Spoken Dialogue Systems (SDSs). The datasets include audio recordings and several manual annotations, namely miscommunication, anger, satisfaction, repetition, gender and task success. The datasets were created with data from real users and cover two languages: English and Greek. Detectors for miscommunication, anger and gender were trained for both systems. The detectors were particularly accurate in tasks where human annotators show high agreement, such as miscommunication and gender. As expected given the subjectivity of the task, the anger detector performed less satisfactorily. Nevertheless, we showed that automatically detecting situations that can lead to problems in SDSs is possible and is a promising direction for shortening the SDS development cycle.

Details

Paper ID

lrec2016-main-016

Pages

pp. 104-110

DOI

10.63317/394xkfj7sefd

BibKey

lopes-etal-2016-spedial

Editors

Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

978-2-9517408-9-1

Conference

Tenth International Conference on Language Resources and Evaluation

Location

Portorož, Slovenia

Date

23 - 28 May 2016

Authors

José Lopes
Arodami Chorianopoulou
Elisavet Palogiannidi
Helena Moniz
Alberto Abad
Katerina Louka
Elias Iosif
Alexandros Potamianos
