LREC-COLING 2024 Workshop

Adjudicating LLMs as PropBank Adjudicators

Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

DOI:10.63317/2f3kspgxkjxm

Abstract

We evaluate the ability of large language models (LLMs) to provide PropBank semantic role label annotations across different realizations of the same verbs in transitive, intransitive, and middle voice constructions. To assess the meta-linguistic capabilities of LLMs, as well as their ability to acquire such capabilities through in-context learning, we evaluate the models in three settings: a zero-shot setting; a setting where they are given three examples of another verb used in transitive, intransitive, and middle voice constructions; and finally a setting where they are given those examples as well as the correct sense and roleset information. We find that zero-shot knowledge of PropBank annotation is almost nonexistent. The largest model evaluated, GPT-4, achieves the best performance in the setting where it is given both examples and the correct roleset in the prompt, demonstrating that larger models can acquire some meta-linguistic capabilities through in-context learning. However, even in this setting, which is simpler than the task facing a human PropBank annotator, the model achieves only 48% accuracy in marking numbered arguments correctly. To ensure transparency and reproducibility, we publicly release our dataset and model responses.

Details

Paper ID
lrec2024-ws-dmr-12
Pages
pp. 112-123
BibKey
bonn-etal-2024-adjudicating
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
Location
N/A
Date
20–25 May 2024

Authors

  • Julia Bonn
  • Harish Tayyar Madabushi
  • Jena D. Hwang
  • Claire Bonial
