Title

Annotation of prominent words, prosodic boundaries and segmental lengthening by non-expert transcribers in the Spoken Dutch Corpus

Authors

Jeska Buhmann (Universiteit Gent, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium)

Johanneke Caspers (Phonetics Laboratory, Universiteit Leiden, Cleveringaplaats 1, PO Box 9515, 2300 RA Leiden, The Netherlands)

Vincent J. van Heuven (Phonetics Laboratory, Universiteit Leiden, Cleveringaplaats 1, PO Box 9515, 2300 RA Leiden, The Netherlands)

Heleen Hoekstra (UiL-OTS, Universiteit Utrecht, Trans 10, 3512 JK Utrecht, The Netherlands)

Jean-Pierre Martens (Universiteit Gent, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium)

Marc Swerts (CNTS, Universitaire Instelling Antwerpen, Universiteitsplein 1, B-2610 Wilrijk, Belgium)

Session

SO4: Annotation Tools For Speech LRs

Abstract

This paper first describes the aims of the prosodic annotation for (part of) the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN) and the procedures currently being developed to produce that annotation. It then reports on a pilot study run to estimate the costs and the attainable quality (in terms of inter-transcriber consistency) of the envisaged annotation. We claim that high-quality prosodic annotation (of prominence, prosodic breaks, and unusual segmental lengthening) can be obtained from non-experts, provided they are given a strict, written protocol and a short period of supervision and feedback.

Keywords

Prominent words, Prosodic boundaries

Full Paper

96.pdf