LREC 2022, Main Conference

Standard German Subtitling of Swiss German TV content: the PASSAGE Project

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI: 10.63317/44kxz3knsgaj

Abstract

In Switzerland, two thirds of the population speak Swiss German, a primarily spoken language with no standardised written form. It is widely used on Swiss TV, for example in news reports, interviews or talk shows, and subtitles are required for people who cannot understand this spoken language. This paper focuses on the task of automatic Standard German subtitling of spoken Swiss German, and more specifically on the translation of a normalised Swiss German speech recognition result into Standard German suitable for subtitles. Our contribution consists of a comparison of different statistical and deep learning MT systems for this task and an aligned corpus of normalised Swiss German and Standard German subtitles. Results of two evaluations, automatic and human, show that the systems succeed in improving the content, but are currently not capable of producing entirely correct Standard German.

Details

Paper ID
lrec2022-main-541
Pages
pp. 5063-5070
BibKey
mutal-etal-2022-standard
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20-25 June 2022

Authors

  • Jonathan Mutal
  • Pierrette Bouillon
  • Johanna Gerlach
  • Veronika Haberkorn

Links