Balancing the Scales: Reinforcement Learning for Fair Classification

Proceedings of the Second Workshop of Identity Aware AI

Abstract

Fairness in classification has traditionally been pursued by removing bias from neural representations, but recent approaches have shifted towards algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance without manipulating representations, and so avoid the risk of discarding valuable information. Reinforcement Learning (RL), which learns through interaction and whose reward function can be adjusted to encourage desired behaviors, is a promising approach in this domain. In this paper, we conduct an exploratory evaluation of RL for addressing bias in imbalanced classification by scaling the reward function. We employ the contextual multi-armed bandit framework, adapt three popular RL algorithms, and conduct an extensive empirical evaluation of their relative strengths and limitations. Through this analysis, we contribute empirical evidence to the ongoing debate between algorithmic and representational fairness approaches.
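To make the idea of reward scaling concrete, the following is a minimal sketch rather than the method evaluated in the paper. It assumes that each example belongs to a group (for instance a class label, or a label paired with a protected attribute), that the classifier's prediction is treated as a bandit action, and that the reward is multiplied by an inverse-frequency weight. The function names `inverse_frequency_scales` and `scaled_reward`, the base reward of +1 or -1, and the specific weighting formula are all illustrative assumptions.

```python
from collections import Counter


def inverse_frequency_scales(groups):
    """Return a weight per group that grows as the group gets rarer.

    Assumption: weight = n / (k * count), so a perfectly balanced
    dataset gives every group a weight of 1.0.
    """
    counts = Counter(groups)
    n = len(groups)
    k = len(counts)
    return {g: n / (k * c) for g, c in counts.items()}


def scaled_reward(action, label, group, scales):
    """Bandit-style reward: +1 for a correct prediction, -1 otherwise,
    multiplied by the weight of the example's group (an assumption)."""
    base = 1.0 if action == label else -1.0
    return base * scales[group]


# Toy imbalanced data: 8 examples of group "a", 2 of group "b".
# Here the group is simply the class label.
groups = ["a"] * 8 + ["b"] * 2
scales = inverse_frequency_scales(groups)

# A correct prediction on the rare group earns a larger reward than
# a correct prediction on the frequent group.
reward_rare = scaled_reward("b", "b", "b", scales)
reward_frequent = scaled_reward("a", "a", "a", scales)
```

Under this weighting, errors on under-represented groups are penalized more heavily, which is one way a scaled reward could push a policy towards balanced performance; other scaling schemes are equally possible.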