
Contrastively Pre-trained Event Embeddings with Schema-free LLM Annotations

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3sezhi63dcqv

Abstract

Event extraction is a notoriously challenging problem, due among other factors to the scarcity of suitable training data. Moreover, event-centric knowledge bases are not available for most domains, making traditional distant supervision strategies difficult to implement. In this paper, we evaluate the potential of using LLM-generated annotations as an alternative distant supervision signal. Specifically, we create a synthetically labelled event extraction corpus, using an LLM to identify event triggers and arguments, and to provide corresponding free-text descriptions. We then pre-train event embedding models on this corpus using a contrastive loss, before fine-tuning them in the usual way. We empirically show the effectiveness of this approach.
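The abstract does not specify the exact form of the contrastive loss; a common choice for pairing a mention with its matching free-text description is an in-batch InfoNCE-style objective. The sketch below is a hypothetical illustration of that idea (the function name, temperature value, and NumPy-only setup are assumptions, not taken from the paper): each event embedding is pulled toward the embedding of its own LLM-generated description, while the other descriptions in the batch serve as negatives.

```python
import numpy as np

def info_nce_loss(event_emb: np.ndarray, desc_emb: np.ndarray,
                  temperature: float = 0.07) -> float:
    """In-batch contrastive (InfoNCE-style) loss.

    Row i of `event_emb` and row i of `desc_emb` form a positive pair
    (an event mention and its LLM-generated description); all other
    rows in the batch act as negatives.
    """
    # L2-normalise so that dot products are cosine similarities.
    e = event_emb / np.linalg.norm(event_emb, axis=1, keepdims=True)
    d = desc_emb / np.linalg.norm(desc_emb, axis=1, keepdims=True)
    logits = e @ d.T / temperature  # (batch, batch) similarity matrix

    # Row-wise log-softmax; the diagonal entries are the positive pairs.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

In this setup the loss is minimised when each event embedding is more similar to its own description than to any other description in the batch, which is what drives the pre-training signal before standard fine-tuning.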

Details

Paper ID
lrec2026-main-591
Pages
pp. 7457-7478
BibKey
mtumbuka-etal-2026-contrastively
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Frank Mtumbuka
  • Steven Schockaert
