The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
The Boarnsterhim Corpus consists of 250 hours of speech in both West Frisian and Dutch by the same sample of bilingual speakers. The corpus contains original recordings from 1982-1984 and a replication study recorded 35 years later. The data collection spans the speech of four generations and combines panel and trend data. This paper describes the Boarnsterhim Corpus halfway through the project, which started in 2016. It covers how the corpus was collected, its annotations, its potential uses, and the envisaged tools and end-user web application.