Which languages are "hot", and which are "cool"? Using Universal Dependencies for large-scale comparisons of subject expression
Proceedings of the Ninth Workshop on Universal Dependencies (UDW 2026)
Abstract
This study uses Universal Dependencies to investigate subject omission across fifty-six news corpora and twenty geographic varieties of English. Building on McLuhan’s "hot–cool" distinction, Hall’s low-context–high-context (LC–HC) continuum, and Bisang’s notion of overt vs. hidden complexity, it tests whether subject omission rates reflect degrees of contextual reliance. The results broadly support these theories: low-context, "hot" languages, such as German, Dutch and Swedish, show low omission rates, while high-context, "cool" languages, such as Japanese, Korean and Chinese, show higher rates. English behaves as a "hot" language but varies internally, with Southeast Asian varieties omitting subjects more often than African ones. The study provides large-scale quantitative evidence while highlighting the need for further theoretical and methodological refinement, particularly regarding the role of word order and agreement.
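As an illustration only, a subject omission rate of the kind discussed here could be estimated from a CoNLL-U treebank roughly as follows. This is a minimal sketch and not the procedure used in the study: the function names, the restriction to root predicates tagged `VERB`, and the choice of `nsubj` and `csubj` (including subtypes) as overt subjects are all assumptions made for the example. Imperatives, nonverbal predicates, and embedded clauses are deliberately ignored.

```python
# Illustrative sketch: share of root verbal predicates without an overt subject.
# All names and the operationalization are assumptions, not the paper's method.

SUBJECT_RELS = {"nsubj", "csubj"}  # assumed set of overt-subject relations


def parse_conllu(text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:  # a blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):  # sentence-level comment
            continue
        cols = line.split("\t")
        # Skip malformed rows, multiword tokens (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        head = int(cols[6]) if cols[6].isdigit() else -1
        sentence.append((int(cols[0]), cols[3], head, cols[7]))
    if sentence:
        yield sentence


def omission_rate(text):
    """Return the fraction of root VERB predicates lacking a subject dependent."""
    total = 0
    omitted = 0
    for sent in parse_conllu(text):
        # Token ids that govern at least one subject relation (subtypes included).
        heads_with_subject = {
            head for (_id, _upos, head, deprel) in sent
            if deprel.split(":")[0] in SUBJECT_RELS
        }
        for tok_id, upos, _head, deprel in sent:
            if upos == "VERB" and deprel == "root":
                total += 1
                if tok_id not in heads_with_subject:
                    omitted += 1
    return omitted / total if total else 0.0


def _row(idx, form, upos, head, deprel):
    """Build one 10-column CoNLL-U row with unused fields left as '_'."""
    return "\t".join(
        [str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"]
    )


# Two toy sentences: one with an overt subject, one without.
SAMPLE = "\n".join([
    "# text = She sleeps.",
    _row(1, "She", "PRON", 2, "nsubj"),
    _row(2, "sleeps", "VERB", 0, "root"),
    _row(3, ".", "PUNCT", 2, "punct"),
    "",
    "# text = Piove.",
    _row(1, "Piove", "VERB", 0, "root"),
    _row(2, ".", "PUNCT", 1, "punct"),
    "",
])

if __name__ == "__main__":
    print(omission_rate(SAMPLE))
```

On the toy sample, one of the two root predicates has no subject dependent, so the function returns 0.5. Any real comparison across treebanks would need to take differences in annotation practice into account, which this sketch does not attempt.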