
DanPASS - A Danish Phonetically Annotated Spontaneous Speech Corpus

Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006)

DOI:10.63317/2vyy49ibryvv

Abstract

A corpus is described consisting of non-scripted monologues and dialogues, recorded by 22 speakers, comprising a total of about 70,000 words, corresponding to well over 10 hours of speech. The monologues were recorded as one-way communication with a blind partner, where the speaker performed three different tasks: (S)he described a network consisting of various geometrical shapes in various colours. (S)he guided the listener through four different routes in a virtual city map. (S)he instructed the listener how to build a house from its individual parts. The dialogues are replicas of the HCRC map tasks (http://www.hcrc.ed.ac.uk/maptask/). Annotation is performed in Praat. The sound files are segmented into prosodic phrases, words, and syllables, always to the nearest zero-crossing in the waveform. The corpus is supplied, in seven separate interval tiers, with an orthographical transcription, detailed part-of-speech tags, simplified part-of-speech tags, a phonological transcription, a broad phonetic transcription, the pitch relation between each stressed and post-tonic syllable, the phrasal intonation, and an empty tier for comments.
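As an illustration of what segmenting "to the nearest zero-crossing" means, the following minimal sketch moves a boundary, given as a sample index, to the closest point where the waveform changes sign. It is not the tooling used for the corpus, where annotation was done in Praat. The function name `snap_to_zero_crossing` and the representation of the waveform as a plain list of numbers are assumptions made for this example only.

```python
def snap_to_zero_crossing(samples, boundary):
    """Return the sample index of the zero-crossing nearest to `boundary`.

    Hypothetical helper for illustration; `samples` is a sequence of
    amplitude values and `boundary` is a sample index.
    """
    candidates = []
    for i, cur in enumerate(samples):
        if cur == 0:
            # The sample lies exactly on the zero line.
            candidates.append(i)
        elif i > 0 and samples[i - 1] * cur < 0:
            # Sign change between i-1 and i: keep the sample closer to zero.
            prev = samples[i - 1]
            candidates.append(i - 1 if abs(prev) < abs(cur) else i)
    if not candidates:
        # No zero-crossing in the signal: leave the boundary unchanged.
        return boundary
    return min(candidates, key=lambda i: abs(i - boundary))


wave = [0.5, 0.2, -0.1, -0.4, -0.2, 0.3, 0.6]
print(snap_to_zero_crossing(wave, 5))
```

Placing boundaries at zero-crossings avoids audible clicks when a segment is cut out and played in isolation, which is presumably why the convention is applied to every boundary.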

Details

Paper ID
lrec2006-main-001
Pages
N/A
BibKey
gronnum-2006-danpass
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-2-4
Conference
Fifth International Conference on Language Resources and Evaluation
Location
Genoa, Italy
Date
24-26 May 2006

Authors

  • Nina Grønnum
