Seeing the Other Side: Diagnostic Tasks for Viewpoint Reasoning in Vision–Language Models
Humans can integrate multiple visual perspectives and infer how an object appears from unseen sides. This study investigates whether Large Vision Language Models (LVLMs) exhibit a comparable ability for reference-grounded spatial reasoning. We propose two diagnostic tasks: Opposite-Side Reasoning, which determines whether two images show the same object from opposite viewpoints, and Viewpoint Identification, which predicts the viewpoint of a target image using a reference image and its label. An additional condition, Viewpoint Identification (no-ref), removes reference information to reveal cases solvable without it, distinguishing genuine reasoning from bias-driven shortcuts. Our evaluation shows that both open and proprietary LVLMs fall far short of human performance. Even state-of-the-art proprietary LVLMs with relatively high accuracy retain many correct answers when reference information is removed, suggesting that their success often relies on linguistic or dataset-driven priors rather than genuine reference-based reasoning. These findings indicate that current LVLMs have not yet achieved consistent, reference-grounded spatial reasoning. The datasets from this work will be released on the Hugging Face Hub to support future research on multimodal viewpoint reasoning and spatial understanding.
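The comparison between Viewpoint Identification and its no-ref condition can be illustrated with a minimal sketch. This is not the authors' evaluation code: the `ItemResult` record, the `summarize` function, the `retained_fraction` name, and the viewpoint labels in the usage example are all hypothetical, chosen only to show how one might measure how many correct answers survive when the reference is removed.

```python
from dataclasses import dataclass


@dataclass
class ItemResult:
    """One evaluation item (hypothetical record layout, not from the paper)."""
    gold: str           # ground-truth viewpoint label of the target image
    pred_with_ref: str  # model prediction when the reference image and label are given
    pred_no_ref: str    # model prediction in the no-ref condition


def summarize(results: list[ItemResult]) -> dict[str, float]:
    """Compute accuracy in both conditions and the share of with-ref
    correct answers that remain correct without the reference."""
    if not results:
        raise ValueError("results must not be empty")

    n = len(results)
    correct_ref = [r for r in results if r.pred_with_ref == r.gold]
    correct_no_ref = [r for r in results if r.pred_no_ref == r.gold]
    # Items answered correctly in both conditions: these are solvable
    # without the reference, so they do not demonstrate reference use.
    retained = [r for r in correct_ref if r.pred_no_ref == r.gold]

    return {
        "acc_with_ref": len(correct_ref) / n,
        "acc_no_ref": len(correct_no_ref) / n,
        "retained_fraction": len(retained) / len(correct_ref) if correct_ref else 0.0,
    }


if __name__ == "__main__":
    # Illustrative labels only; the paper's actual label set may differ.
    demo = [
        ItemResult("front", "front", "front"),
        ItemResult("back", "back", "left"),
        ItemResult("left", "left", "left"),
        ItemResult("right", "front", "right"),
    ]
    print(summarize(demo))
```

A high `retained_fraction` would suggest, under these assumptions, that many correct answers do not depend on the reference at all, which is the kind of shortcut the no-ref condition is meant to expose.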