Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages

Proceedings of the Thirteenth International Conference on Language Resources and Evaluation (LREC 2022)

Abstract

Conventional crowdsourced bilingual dictionary creation asks multiple workers to translate the same words or sentences and takes a majority vote. However, when this method is applied to low-resource languages with few speakers, many low-quality workers are expected to take part in the vote, which makes it difficult to maintain the quality of the majority-vote evaluation. We therefore apply, for quality control, an effective aggregation method based on hyper questions, where a hyper question is a set of single questions. Furthermore, to select high-quality workers, we design a task-allocation method based on worker reliability, which is estimated from their work results.
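The aggregation idea in the abstract can be illustrated with a minimal sketch. It assumes that a worker's answer to a hyper question is the tuple of their answers to its single questions, that a majority vote is taken over those tuples, and that each single answer is then decided by a second vote across the winning tuples. The function names, the parameter `k`, and the data layout are hypothetical and are not taken from the paper, whose exact procedure may differ.

```python
from collections import Counter
from itertools import combinations


def majority(votes):
    """Return the most common vote (ties go to the vote seen first)."""
    return Counter(votes).most_common(1)[0][0]


def hyper_question_vote(answers, k=2):
    """Aggregate answers by voting on groups of k questions.

    answers: {worker: {question: answer}} (hypothetical layout).
    """
    questions = sorted({q for per_worker in answers.values() for q in per_worker})
    per_question = {q: [] for q in questions}
    for group in combinations(questions, k):
        # A worker's answer to the hyper question is the tuple of single answers.
        tuples = [
            tuple(a[q] for q in group)
            for a in answers.values()
            if all(q in a for q in group)
        ]
        if not tuples:
            continue
        winner = majority(tuples)
        # Record what the winning tuple says about each single question.
        for q, ans in zip(group, winner):
            per_question[q].append(ans)
    # Second vote: decide each single question from the winning tuples.
    return {q: majority(v) for q, v in per_question.items() if v}
```

In this sketch, workers who agree on every question in a group outvote workers whose answers coincide on only one question, which is the situation the abstract describes when low-quality workers dominate a plain majority vote.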

Details

Paper ID

lrec2022-main-709

Pages

pp. 6590-6596

DOI

10.63317/2qqqjtn8boo9

BibKey

chida-etal-2022-quality

Editors

Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

979-10-95546-38-2

Conference

Thirteenth Language Resources and Evaluation Conference

Location

Marseille, France

Date

20 - 25 June 2022

Authors

Hiroki Chida
Yohei Murakami
Mondheera Pituxcoosuvarn
