
Localizing Events in Space: Comparing Humans and AI Models

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/5hmw7fkruat3

Abstract

Understanding how Large Language Models (LLMs) and Text-to-Image models (T2Is) acquire and apply implicit spatial knowledge remains an open challenge. In this paper, we present a novel dataset and evaluation framework designed to probe event localization capabilities in humans, LLMs, and T2Is. Our dataset includes 134 sentence pairs derived from Flickr30k captions, in which explicit location information is systematically removed via Abstract Meaning Representation (AMR) parsing and manual refinement. Using this dataset, we analyze the effects of location ablation on spatial reasoning across human annotators, LLMs, and T2Is. Results show that while humans maintain robust location inferences after ablation, LLMs exhibit degraded performance, particularly for polysemous verbs. T2Is demonstrate similar limitations, often generating visually inconsistent spatial contexts when locative cues are missing. Our findings highlight the gap between humans and both LLMs and T2Is in recovering implicit situational knowledge and suggest future directions for improving spatial reasoning in multimodal AI systems. This dataset contribution serves as a proof-of-concept for systematic evaluation of implicit spatial reasoning and paves the way for larger-scale studies.

Details

Paper ID
lrec2026-main-865
Pages
pp. 11072-11084
BibKey
kim-etal-2026-localizing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Derrick Eui Gyu Kim

  • Kenneth Lai

  • James Pustejovsky
