Multidialectal Spanish Modeling for ASR

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

Abstract

This paper describes the latest advances in our ongoing work on multidialectal Spanish speech recognition. The work examines the suitability of a single multidialectal acoustic model for all the Spanish variants spoken in Europe and Latin America. The objective is twofold. First, it allows all the available databases to be used jointly to train and improve the same system. Second, it allows a single system to serve all Spanish speakers. Our latest experiments optimize the acoustic models by applying a hybrid top-down/bottom-up clustering algorithm. Overall, multidialectal acoustic modeling maintains the performance of the recognition system even when it is tested on an unseen dialect, that is, a dialect not present in the training data.