Back to Main Conference 2022
LREC 2022main

Towards Speaker Verification for Crowdsourced Speech Collections

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

DOI:10.63317/3gd3v95paiwa

Abstract

Crowdsourcing the collection of speech provides a scalable setting to access a customisable demographic according to each dataset’s needs. The correctness of speaker metadata is especially relevant for speaker-centred collections - ones that require the collection of a fixed amount of data per speaker. This paper identifies two different types of misalignment present in these collections: Multiple Accounts misalignment (different contributors map to the same speaker), and Multiple Speakers misalignment (multiple speakers map to the same contributor). Based on state-of-the-art approaches to Speaker Verification, this paper proposes an unsupervised method for measuring speaker metadata plausibility of a collection, i.e., evaluating the match (or lack thereof) between contributors and speakers. The solution presented is composed of an embedding extractor and a clustering module. Results indicate high precision in automatically classifying contributor alignment (>0.94).

Details

Paper ID
lrec2022-main-637
Pages
pp. 5929-5937
BibKey
mendonca-etal-2022-towards
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-38-2
Conference
Thirteenth Language Resources and Evaluation Conference
Location
Marseille, France
Date
20 June 2022 25 June 2022

Authors

  • JM

    John Mendonca

  • RC

    Rui Correia

  • ML

    Mariana Lourenço

  • JF

    João Freitas

  • IT

    Isabel Trancoso

Links