Semantic Capacity in Language Learners and LLMs: A Case Study of Quantifier Scope
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
This study investigates the semantic capacity of large language models (LLMs) through the lens of quantifier scope interpretation. Sentences containing multiple quantifiers often give rise to interpretive ambiguities, and the range of available readings can vary across languages. Adopting a cross-linguistic perspective, we examine how LLMs interpret quantifier scope in English and Chinese, using model-generated probabilities to assess the relative likelihood of competing interpretations. Human similarity (HS) scores quantify the extent to which LLMs emulate human performance across language groups. Results reveal that most LLMs prefer surface scope interpretations, aligning with human tendencies, while only some differentiate between English and Chinese in their inverse scope preferences, mirroring human-like patterns. HS scores reveal considerable variability in how closely LLMs approximate human behavior, though their overall potential for alignment is notable. Linguistic identity, instantiated through monolingual and bilingual English or Chinese personas, also influences LLM behavior. Differences in model architecture, scale, and especially the language composition of pre-training data significantly affect how closely LLMs approximate human quantifier scope interpretations.