From Behavior to Geometry: A Causal and Geometric Analysis of LoRA-Based Domain Adaptation
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA) often improves a large language model’s in-domain performance at the cost of cross-domain generalization. We investigate the mechanistic basis for this trade-off, asking whether LoRA creates new discriminative directions in representation space (emergence) or merely reshapes pre-existing ones. Using a Word Sense Disambiguation testbed, we couple controlled behavioral evaluation with causal localization and geometric diagnostics. We find that LoRA learns new, spatially localized discriminative directions in the middle layers of the network, concentrated at token positions critical to the task. This "subspace extension" account explains why LoRA-tuned models excel on in-domain data yet struggle to transfer. As a proof of concept, we introduce a mechanistically informed LoRA configuration that concentrates capacity in the identified layers, promotes rank diversity, and applies light answer-token calibration. Without increasing the training budget, it yields consistent improvements in both in-domain and cross-domain settings, demonstrating that mechanistic insight can guide more efficient adaptation.
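To make the "concentrate capacity in the identified layers" idea concrete, the sketch below shows one way such a configuration could be expressed with Hugging Face PEFT's per-layer `rank_pattern`. This is a minimal illustration, not the paper's released code: the model name, the middle-layer band, and the rank split are hypothetical placeholders, and the answer-token calibration step is omitted.

```python
# A minimal sketch of a layer-concentrated LoRA configuration, assuming a
# LLaMA-style decoder. MIDDLE_LAYERS, HIGH_RANK, BASE_RANK, and the model
# name are illustrative assumptions, not the paper's reported settings.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

MIDDLE_LAYERS = range(10, 22)   # hypothetical: layers flagged by causal localization
HIGH_RANK, BASE_RANK = 32, 4    # hypothetical rank split under a fixed budget

# PEFT matches rank_pattern keys against the tail of each module name,
# so these keys select the attention projections of the middle layers.
rank_pattern = {
    f"layers.{i}.self_attn.{proj}": HIGH_RANK
    for i in MIDDLE_LAYERS
    for proj in ("q_proj", "v_proj")
}

config = LoraConfig(
    r=BASE_RANK,                  # low default rank outside the middle layers
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    rank_pattern=rank_pattern,    # extra capacity where new directions emerge
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```

Under this kind of split, the total number of trainable parameters can be held close to a uniform-rank baseline while the adapter's capacity is shifted toward the layers where the analysis locates the new discriminative directions.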