Back to Main Conference 2024
LREC-COLING 2024main

A Comparative Study of Explicit and Implicit Gender Biases in Large Language Models via Self-evaluation

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/3fpm9f8exw68

Abstract

While extensive work has examined the explicit and implicit biases in large language models (LLMs), little research explores the relation between these two types of biases. This paper presents a comparative study of the explicit and implicit biases in LLMs grounded in social psychology. Social psychology distinguishes between explicit and implicit biases by whether the bias can be self-recognized by individuals. Aligning with this conceptualization, we propose a self-evaluation-based two-stage measurement of explicit and implicit biases within LLMs. First, the LLM is prompted to automatically fill templates with social targets to measure implicit bias toward these targets, where the bias is less likely to be self-recognized by the LLM. Then, the LLM is prompted to self-evaluate the templates filled by itself to measure explicit bias toward the same targets, where the bias is more likely to be self-recognized by the LLM. Experiments conducted on state-of-the-art LLMs reveal human-like inconsistency between explicit and implicit occupational gender biases. This work bridges a critical gap where prior studies concentrate solely on either explicit or implicit bias. We advocate that future work highlight the relation between explicit and implicit biases in LLMs.

Details

Paper ID
lrec2024-main-0017
Pages
pp. 186-198
BibKey
zhao-etal-2024-comparative
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • YZ

    Yachao Zhao

  • BW

    Bo Wang

  • YW

    Yan Wang

  • DZ

    Dongming Zhao

  • XJ

    Xiaojia Jin

  • JZ

    Jijun Zhang

  • RH

    Ruifang He

  • YH

    Yuexian Hou

Links