The Emergence of the Pragmatic Dimension in Instructed LMs
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Instruction-tuning fundamentally transforms how language models process linguistic input and interact with users. Through the lens of speech act theory, we investigate whether instruction-tuning causes models to shift from prioritizing syntactic form to prioritizing pragmatic intent. We create a controlled dataset of 400 sentences that systematically vary along two dimensions: syntactic structure (declarative vs. interrogative) and communicative intent (assertion vs. request). Using Principal Component Analysis on hidden state representations from Qwen2.5 (1.5B–7B) and models from two other families (Gemma3-1B and Llama3.2-3B), we reveal a consistent pattern: base models cluster sentences by syntactic form, while instruction-tuned models reorganize representations around pragmatic intent. This syntactic-to-pragmatic shift occurs in the middle layers: declarative requests and interrogative requests, maximally separated in base models, become the most similar categories after instruction-tuning. The phenomenon explains how instruction-tuned models correctly interpret indirect speech acts, treating polite declaratives such as "I'd appreciate corrections" as functionally equivalent to direct interrogatives. Our findings demonstrate that instruction-tuning teaches models to prioritize the communicative dimension over surface form, a fundamental reorganization that is consistent across model scales and architectures.
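To make the pipeline concrete, here is a minimal sketch of the analysis the abstract describes: mean-pooled middle-layer hidden states for sentences crossing syntactic form with communicative intent, projected with PCA. The Qwen2.5 model names come from the abstract; the four example sentences, the mean-pooling choice, and all helper names are illustrative assumptions, not the paper's actual dataset or code.

```python
# Sketch of the abstract's analysis: extract middle-layer hidden states for
# sentences crossing syntactic form with communicative intent, then project
# them with PCA to see which dimension organizes the representation space.
# NOTE: the sentences and helper names below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.decomposition import PCA

# Four cells of the 2x2 design: (syntactic form, communicative intent).
SENTENCES = {
    ("declarative", "assertion"): "The report was submitted yesterday.",
    ("interrogative", "assertion"): "Was the report submitted yesterday?",
    ("declarative", "request"): "I'd appreciate corrections to the report.",
    ("interrogative", "request"): "Could you correct the report?",
}

def mean_hidden_state(model, tokenizer, text, layer):
    """Mean-pool one layer's hidden states over the sentence's tokens."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

for name in ["Qwen/Qwen2.5-1.5B", "Qwen/Qwen2.5-1.5B-Instruct"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    model.eval()
    layer = model.config.num_hidden_layers // 2  # a middle layer
    reps = torch.stack(
        [mean_hidden_state(model, tokenizer, s, layer) for s in SENTENCES.values()]
    )
    coords = PCA(n_components=2).fit_transform(reps.float().numpy())
    for (form, intent), xy in zip(SENTENCES, coords):
        print(f"{name} | {form:13s} {intent:9s} -> PC1={xy[0]:+.3f} PC2={xy[1]:+.3f}")
```

If the abstract's finding holds, the base model's principal-component coordinates should place the two declaratives near each other, while the instruction-tuned model's should instead place the two requests near each other, regardless of their surface form.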