LREC 2026 Workshop: POLITICALNLP 2026

PReSS: An Automated Black-Box Framework for Evaluating Political Stance Stability in LLMs

Proceedings of the Second Workshop on Building Educational Applications Using NLP

DOI:10.63317/35d8pipu4ofv

Abstract

Existing evaluations of political bias in large language models (LLMs) typically classify outputs as left- or right-leaning. We extend this perspective by examining how ideological tendencies vary across topics and how consistently models maintain their positions, a property we refer to as stability. To capture this dimension, we propose PReSS (Political Response Stability under Stress), an automated black-box framework that evaluates LLMs by jointly considering model and topic context, categorizing responses into four stance types: stable-left, unstable-left, stable-right, and unstable-right. Applying PReSS to 9 widely used LLMs across 19 political topics reveals substantial variation in stance stability; for instance, a model that is left-leaning overall can exhibit stable-right behavior on certain topics. This highlights the importance of topic-aware and fine-grained evaluation of political ideologies of LLMs. Moreover, stability has practical implications for controlled generation and model alignment: interventions such as debiasing or ideology reversal should explicitly account for stance stability. Our empirical analyses reveal that when models are prompted or fine-tuned to adopt the opposite ideology, unstable topic stances are more likely to change, whereas stable ones resist modification. Thus, treating stability as a moderating factor provides a principled foundation for understanding, evaluating, and guiding interventions in politically sensitive model behavior.
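The four-way categorization described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how PReSS scores leaning or measures stability, so everything below is an assumption made for illustration: the `Probe` record, the signed stance score in [-1, 1] (negative meaning left), the sign-agreement measure of stability, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Probe:
    """One question on a topic (hypothetical record, not from the paper)."""
    baseline: float        # assumed stance score in [-1, 1]; negative = left
    stressed: List[float]  # assumed scores for the same question under stress prompts


def stance_type(probes: List[Probe], stability_threshold: float = 0.8) -> str:
    """Label one (model, topic) pair as stable/unstable and left/right.

    Direction is the sign of the mean baseline score (zero counts as right).
    Stability is the fraction of stressed responses whose sign matches the
    baseline for the same question. Both choices are illustrative assumptions.
    """
    kept = [
        (s < 0) == (p.baseline < 0)
        for p in probes
        for s in p.stressed
    ]
    if not kept:
        raise ValueError("need at least one stressed response")

    direction = "left" if mean(p.baseline for p in probes) < 0 else "right"
    stability = sum(kept) / len(kept)
    label = "stable" if stability >= stability_threshold else "unstable"
    return f"{label}-{direction}"


if __name__ == "__main__":
    topic_a = [Probe(-0.6, [-0.5, -0.7, -0.4]), Probe(-0.2, [-0.1, -0.3, -0.2])]
    topic_b = [Probe(0.5, [0.4, -0.2, -0.3])]
    print(stance_type(topic_a))
    print(stance_type(topic_b))
```

Computing the label per (model, topic) pair, rather than per model, is what allows a model that leans left overall to be labeled stable-right on an individual topic.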

Details

Paper ID: lrec2026-ws-politicalnlp-26
Pages: pp. 234-247
BibKey: kabir-etal-2026-press
Editors: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of the Second Workshop on Building Educational Applications Using NLP
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Shariar Kabir
  • Yue Dong
  • Kevin Esterling

Links