Back to Main Conference 2024
LREC-COLING 2024main

Exploring and Mitigating Shortcut Learning for Generative Large Language Models

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/4z4vjvv2kxaj

Abstract

Recent generative large language models (LLMs) have exhibited incredible instruction-following capabilities while keeping strong task completion ability, even without task-specific fine-tuning. Some works attribute this to the bonus of the new scaling law, in which the continuous improvement of model capacity yields emergent capabilities, e.g., reasoning and universal generalization. However, we point out that recent LLMs still show shortcut learning behavior, where the models tend to exploit spurious correlations between non-robust features and labels for prediction, which might lead to overestimating model capabilities. LLMs memorize more complex spurious correlations (i.e., task ↔ feature ↔ label) compared with that learned from previous pre-training and task-specific fine-tuning paradigm (i.e., feature ↔ label). Based on our findings, we propose FSLI, a framework for encouraging LLMs to Forget Spurious correlations and Learn from In-context information. Experiments on three tasks show that FSFI can effectively mitigate shortcut learning. Besides, we argue not to overestimate the capabilities of LLMs and conduct evaluations in more challenging and complete test scenarios.

Details

Paper ID
lrec2024-main-0602
Pages
pp. 6883-6893
BibKey
sun-etal-2024-exploring
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • ZS

    Zechen Sun

  • YX

    Yisheng Xiao

  • JL

    Juntao Li

  • YJ

    Yixin Ji

  • WC

    Wenliang Chen

  • MZ

    Min Zhang

Links