The AUTONOMATA Spoken Names Corpus
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)
Abstract
In the Autonomata project we have collected a corpus of spoken name utterances, together with manually corrected phonemic transcriptions of these utterances. The corpus is intended to become a major resource for the development of automatic speech recognition engines that can achieve high accuracy in recognizing person and geographical names spoken in Dutch. The recorded names were selected to reveal the major pronunciation variations that a speech recognizer, for example in a navigation system with speech input, will be confronted with. These variations include native speakers pronouncing foreign names and non-native speakers pronouncing native names.