SPEECON – Speech Databases for Consumer Devices: Database Specification and Validation
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)
Abstract
SPEECON (Speech-Driven Interfaces for Consumer Devices) is a project that aims to develop voice-driven interfaces for consumer applications. Led by an industrial consortium, the project sets out to collect speech data for at least 20 languages, with 600 speakers per language (mostly adults, but children as well). The recordings are made in different environments that are expected to be representative of the future applications. The corpus comprises both spontaneous and read speech; the read speech includes phonetically rich material, a large number of application commands, and isolated items such as digits and names. To safeguard the consistency and high quality of the databases, all of them are subject to validation. This paper describes in detail the specifications of the databases as well as the validation procedure.