Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

Abstract

One of the challenges in automatic speech recognition (ASR) is how to handle pronunciation variation. The main causes of pronunciation variation are the speaker (voice characteristics, accent, non-nativeness, etc.) and the speaking style (reading, spontaneous responses, conversation, etc.). An ASR system has essentially two options for modelling this variation at the word and sub-word level: lexical modelling of the pronunciation variation, or adaptation (i.e. re-training) of the acoustic models. Which technique to choose, or how to combine them, may depend on the speaking style. We have therefore investigated the effects of using pronunciation variants in the recognition of read speech, spontaneous dictation, and non-native speech. The variants in the standard-purpose lexicon that we tested gave modest improvements, with the best results for read speech, which is the speaking style of the acoustic model training set.