LREC 2020 Workshop

Fully Convolutional ASR for Less-Resourced Endangered Languages

Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

DOI:10.63317/4nja58sq9hy4

Abstract

The application of deep learning to automatic speech recognition (ASR) has yielded dramatic accuracy increases for languages with abundant training data, but languages with limited training resources have yet to see accuracy improvements on this scale. In this paper, we compare a fully convolutional approach for acoustic modeling in ASR with a variety of established acoustic modeling approaches. We evaluate our method on Seneca, a low-resource endangered language spoken in North America. Our method yields word error rates up to 40% lower than those reported using both standard GMM-HMM approaches and established deep neural methods, with a substantial reduction in training time. These results show particular promise for languages like Seneca that are both endangered and lack extensive documentation.
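
For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) is conventionally computed and how a statement such as "40% lower" can be read as a relative reduction. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_reduction` are hypothetical, the transcripts and WER values are made-up placeholders rather than results from the paper, and the abstract does not say whether its figure is a relative or an absolute difference.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Placeholder transcripts: one substitution and one deletion
# against a four-word reference.
wer = word_error_rate("a b c d", "a x c")

# Illustrative values only (assumed, not taken from the paper).
reduction = relative_reduction(0.5, 0.3)
```

Under the relative reading, a system at 0.3 WER against a baseline at 0.5 WER would count as 40% lower, even though the absolute difference is 20 percentage points.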

Details

Paper ID
lrec2020-ws-sltu-17
Pages
pp. 126-130
BibKey
thai-etal-2020-fully
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Date
11-16 May 2020

Authors

  • Bao Thai
  • Robert Jimerson
  • Raymond Ptucha
  • Emily Prud’hommeaux
