Operational Assessment of Keyword Search on Oral History

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/2mj6wmppo6xn

Abstract

This project assesses the resources necessary to make oral history searchable by means of automatic speech recognition (ASR). There are many inherent challenges in applying ASR to conversational speech: smaller training set sizes and varying demographics, among others. We assess the impact of dataset size, word error rate and term-weighted value on human search capability through an information retrieval task on Mechanical Turk. We use English oral history data collected by StoryCorps, a national organization that provides all people with the opportunity to record, share and preserve their stories, and control for a variety of demographics including age, gender, birthplace, and dialect on four different training set sizes. We show comparable search performance using a standard speech recognition system as with hand-transcribed data, which is promising for increased accessibility of conversational speech and oral history archives.

Details

Paper ID
lrec2016-main-049
Pages
pp. 317-321
BibKey
salesky-etal-2016-operational
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 to 28 May 2016

Authors

  • Elizabeth Salesky
  • Jessica Ray
  • Wade Shen
