LREC 2016, Main Conference

A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/5krpgnhso5km

Abstract

We present a new speech database containing 18.5 hours of annotated radio broadcasts in the Frisian language. Frisian is spoken mostly in the province of Fryslân and is the second official language of the Netherlands. The recordings were collected from the archives of Omrop Fryslân, the regional public broadcaster of the province. The database covers a time span of almost 50 years. Native speakers of Frisian are mostly bilingual and often code-switch in daily conversation because of the extensive influence of Dutch. Given the longitudinal and code-switching nature of the data, an appropriate annotation protocol was designed, and the data were manually annotated with orthographic transcriptions, speaker identities, dialect information, code-switching details, and background noise/music information.

Details

Paper ID
lrec2016-main-739
Pages
pp. 4666-4669
BibKey
yilmaz-etal-2016-longitudinal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Emre Yilmaz
  • Maaike Andringa
  • Sigrid Kingma
  • Jelske Dijkstra
  • Frits van der Kuip
  • Hans Van de Velde
  • Frederik Kampstra
  • Jouke Algra
  • Henk van den Heuvel
  • David van Leeuwen
