Creation of an Evaluation Corpus and Baseline Evaluation Scores for Welsh Text Summarisation

Proceedings of the 4th Celtic Language Technology Workshop within LREC2022

DOI:10.63317/2m3xd66svknn

Abstract

As part of the effort to increase the availability of Welsh digital technology, this paper introduces the first Welsh summarisation dataset and evaluation results comparing human judgements with automatic metrics, which we provide freely for research purposes to help advance work on Welsh summarisation. The system summaries were created using an extractive graph-based Welsh summariser and were evaluated both by human judges and with a range of ROUGE metric variants (e.g. ROUGE-1, ROUGE-2, ROUGE-L and ROUGE-SU4). The summaries and evaluation results will serve as benchmarks for the development of summarisers and evaluation metrics in other minority language contexts.
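
For readers unfamiliar with the ROUGE family named in the abstract, the following is a minimal sketch of ROUGE-N (n-gram overlap between a reference summary and a system summary). It is not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the whitespace tokenisation, the lowercasing, and the example sentences are all assumptions made for illustration, and ROUGE-L and ROUGE-SU4 are not covered.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Compute ROUGE-N recall, precision and F1.

    Assumption: plain lowercasing and whitespace tokenisation, which is
    simpler than the preprocessing a real ROUGE toolkit applies.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical example sentences, not taken from the dataset.
reference = "mae'r gath yn cysgu ar y mat"
candidate = "mae'r gath ar y mat"
scores = rouge_n(reference, candidate, n=1)
print(scores)
```

Calling `rouge_n(reference, candidate, n=2)` gives the bigram variant. A human-versus-metric comparison of the kind the abstract describes would then relate such scores to human ratings of the same summaries.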

Details

Paper ID
lrec2022-ws-cltw-03
Pages
pp. 14-21
BibKey
el-haj-etal-2022-creation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 4th Celtic Language Technology Workshop within LREC2022
Date
20 June 2022 to 25 June 2022

Authors

  • Mahmoud El-Haj

  • Ignatius Ezeani

  • Jonathan Morris

  • Dawn Knight

Links