The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

Abstract

In this article the I3Media corpus is presented, a trilingual (Catalan, English, Spanish) speech database of neutral and emotional material collected for analysis and synthesis purposes. The corpus is actually made up of six different subsets of material: a neutral subcorpus, containing emotionless utterances; a dialog' subcorpus, containing typical call center utterances; an emotional' corpus, a set of sentences representative of pure emotional states; a football' subcorpus, including utterances imitating a football broadcasting situation; a SMS' subcorpus, including readings of SMS texts; and a paralinguistic elements' corpus, including recordings of interjections and paralinguistic sounds uttered in isolation. The corpus was read by professional speakers (male, in the case of Spanish and Catalan; female, in the case of the English corpus), carefully selected to meet criteria of language competence, voice quality and acting conditions. It is the result of a collaboration between the Speech Technology Group at Telefónica Investigación y Desarrollo (TID) and the Speech and Language Group at Barcelona Media Centre d'Innovació (BM), as part of the I3Media project.