Evaluating Discriminability of Vision-Language Models
We study the discriminative ability of vision-language models (VLMs): the ability to process information by distinguishing key details from unnecessary or redundant parts in order to achieve a specific goal. This ability is vital for the practical use of VLMs in applications such as visual chatbots. Although recent VLMs perform reasonably well across various multimodal capabilities, their discriminative ability has not yet been thoroughly explored. To this end, we construct DiscriBench to evaluate the discriminability of VLMs in various daily life activities. We carefully design the dataset to require distinguishing information in both the vision and language modalities, and we semi-manually craft questions in English and Japanese so that they are solvable without external knowledge or expertise. Experimental results demonstrate a large performance gap (14.0 to 69.3 points) in discriminability between humans and existing VLMs, with humans solving the task at an accuracy of 90% or higher. By reducing the difficulty of the discrimination task, our ablation studies show that vision encoders cannot distinguish visual details well when given images that are generally similar but partially different. We also observe that VLMs make inconsistent inferences across modalities. We will publish DiscriBench (1,200 samples) to foster research in this direction.