Fruitcakes and Cupcakes Emerging from Noise: The ComposiGen Dataset of Compounds and Their Compositionality

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

Abstract

Compounds are a complex linguistic phenomenon: they vary in their degree of compositionality, which often complicates their interpretation. We consider the task of visual-linguistic compositionality prediction for English noun-noun compounds, i.e., predicting the degree to which a compound’s meaning is predictable from its constituents. We introduce a new dataset, *ComposiGen*, which provides constituent-specific human-elicited compositionality ratings for compounds of different concreteness categories, and includes generated visual representations for both compounds and their constituents. To enable controlled comparisons, we structure *ComposiGen* such that head constituents are shared across multiple compounds (e.g., *wedding cake*, *cup cake*). We propose a parameter-based approach that leverages constituent-to-compound image transformations to predict the degree to which each constituent contributes visually to compound meaning. While this approach requires further validation, our overall results show that the generated images, particularly in combination with text, provide valuable information, and that simple late fusion outperforms multimodal transformers. Taken together, our findings highlight a promising avenue for future research on more efficient multimodal models for compositionality prediction. The dataset offers a rich resource for future in-depth research, including the exploration of visual, constituent-based compound formation.