Linguistic Knowledge-Infused Fine-Tuning for Mitigating Gender Bias in Machine Translation
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Large Language Models (LLMs) achieve strong performance in machine translation (MT) but often encode gender bias, particularly when translating from non-gendered into gendered languages. This paper introduces a fine-tuning strategy to mitigate such bias in English-Spanish and English-Catalan translation. Using parameter-efficient LoRA fine-tuning, we apply linguistic knowledge infusion, a reasoning-based method that trains models to identify gendered referents and syntactic cues before generating translations. Experiments with Mistral-7B and Salamandrata-7B on MT-GenEval show that linguistically infused models improve gender accuracy by 15 percentage points and reduce gender gaps by 27 points in English-Spanish translation, with comparable trends for Catalan. Gains are strongest for Mistral, suggesting that explicit linguistic reasoning particularly benefits general-purpose LLMs. Overall, these results demonstrate that structured linguistic priors can enhance fairness and referential consistency in multilingual machine translation.