Design and acquisition of a telephone spontaneous speech dialogue corpus in Spanish: DIHANA
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)
Abstract
Within the framework of the DIHANA project, we present the acquisition process of a spontaneous speech dialogue corpus in Spanish. The selected application is telephone-based information retrieval for nationwide trains. A total of 900 dialogues from 225 users were acquired using the “Wizard of Oz” technique. In this work, we present the design and planning of the dialogue scenes and the wizard strategy used to acquire the corpus. We then describe the acquisition tools and the acquisition process.