A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

Phonetic segmentation is the process of splitting speech into distinct phonetic units. Human experts routinely perform this task manually, analyzing auditory and visual cues in speech analysis software, which is extremely time-consuming. Methods for automatic segmentation exist, but they are not always accurate enough. To improve automatic segmentation, we need to model it as closely as possible on manual segmentation. This corpus is an effort to capture human segmentation behavior by recording experts performing a segmentation task. We believe that this data will enable us to highlight the important aspects of manual segmentation, which can then be used to improve the accuracy of automatic segmentation.

Details

Paper ID

lrec2018-main-675

Pages

N/A

DOI

10.63317/48ccvuqdcrda

BibKey

khan-etal-2018-multimodal

Editors

Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

979-10-95546-00-9

Conference

Eleventh International Conference on Language Resources and Evaluation

Location

Miyazaki, Japan

Date

7 - 12 May 2018

Authors

Arif Khan
Ingmar Steiner
Yusuke Sugano
Andreas Bulling
Ross Macdonald
