Back to Main Conference 2024
LREC-COLING 2024main

Schema-based Data Augmentation for Event Extraction

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/4vrqn4gn7et5

Abstract

Event extraction is a crucial task for semantic understanding and structured knowledge construction. However, the expense of collecting and labeling data for training event extraction models is usually high. To address this issue, we propose a novel schema-based data augmentation method that utilizes event schemas to guide the data generation process. The event schemas depict the typical patterns of complex events and can be used to create new synthetic data for event extraction. Specifically, we sub-sample from the schema graph to obtain a subgraph, instantiate the schema subgraph, and then convert the instantiated subgraph to natural language texts. We conduct extensive experiments on event trigger detection, event trigger extraction, and event argument extraction tasks using two datasets (including five scenarios). The experimental results demonstrate that our proposed data-augmentation method produces high-quality generated data and significantly enhances the model performance, with up to 12% increase in F1 score compared to baseline methods.

Details

Paper ID
lrec2024-main-1253
Pages
pp. 14382-14392
BibKey
jin-ji-2024-schema
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • XJ

    Xiaomeng Jin

  • HJ

    Heng Ji

Links