LREC 2022 Workshop

ReadAlong Studio: Practical Zero-Shot Text-Speech Alignment for Indigenous Language Audiobooks

Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

DOI:10.63317/4a43bxxaahfy

Abstract

While the alignment of audio recordings and text (commonly termed “forced alignment”) is often treated as a solved problem, in practice the process of adapting an alignment system to a new, under-resourced language comes with significant challenges, requiring experience and expertise that many outside of the speech community lack. This puts otherwise “solvable” problems, like the alignment of Indigenous language audiobooks, out of reach for many real-world Indigenous language organizations. In this paper, we detail ReadAlong Studio, a suite of tools for creating and visualizing aligned audiobooks, including educational features like time-aligned highlighting, playing single words in isolation, and variable-speed playback. It is intended to be accessible to creators without an extensive background in speech or NLP, by automating or making optional many of the specialist steps in an alignment pipeline. It is well documented at a beginner-technologist level, has already been adapted to 30 languages, and can work out-of-the-box on many more languages without adaptation.

Details

Paper ID
lrec2022-ws-sigul-04
Pages
pp. 23-32
BibKey
littell-etal-2022-readalong
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Date
20-25 June 2022

Authors

  • Patrick Littell

  • Eric Joanis

  • Aidan Pine

  • Marc Tessier

  • David Huggins-Daines

  • Delasie Torkornoo

Links