LREC-COLING 2024 (main)

Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/2wusz2w6c8aa

Abstract

Warning: This paper contains explicit statements of offensive stereotypes which may be upsetting.

The study of bias, fairness and social impact in Natural Language Processing (NLP) lacks resources in languages other than English. Our objective is to support the evaluation of bias in language models in a multilingual setting. We use stereotypes across nine types of biases to build a corpus containing contrasting sentence pairs: one sentence that presents a stereotype concerning an underadvantaged group, and another, minimally changed sentence concerning a matching advantaged group. We build on the French CrowS-Pairs corpus and guidelines to provide translations of the existing material into seven additional languages. In total, we produce 11,139 new sentence pairs that cover stereotypes dealing with nine types of biases in seven cultural contexts. We use the final resource for the evaluation of relevant monolingual and multilingual masked language models. We find that language models in all languages favor sentences that express stereotypes in most bias categories. The process of creating a resource that covers a wide range of language types and cultural settings highlights the difficulty of bias evaluation, in particular comparability across languages and contexts.

Details

Paper ID
lrec2024-main-1545
Pages
pp. 17764-17769
BibKey
fort-etal-2024-stereotypical
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20-25 May 2024

Authors

  • Karen Fort
  • Laura Alonso Alemany
  • Luciana Benotti
  • Julien Bezançon
  • Claudia Borg
  • Marthese Borg
  • Yongjian Chen
  • Fanny Ducel
  • Yoann Dupont
  • Guido Ivetta
  • Zhijian Li
  • Margot Mieskes
  • Marco Naguib
  • Yuyan Qian
  • Matteo Radaelli
  • Wolfgang S. Schmeisser-Nieto
  • Emma Raimundo Schulz
  • Thiziri Saci
  • Sarah Saidi
  • Javier Torroba Marchante
  • Shilin Xie
  • Sergio E. Zanotto
  • Aurélie Névéol

Links