Multilingual Abstract Meaning Representation for Celtic Languages

Proceedings of the 4th Celtic Language Technology Workshop within LREC2022

DOI:10.63317/5bgp9wfifjrj

Abstract

Deep semantic parsing into Abstract Meaning Representation (AMR) graphs has reached high quality with neural seq2seq approaches. However, the training corpus for AMR is available only for English. Several approaches to processing other languages exist, but only for high-resource languages. We present an approach to creating a multilingual text-to-AMR model for three Celtic languages: Welsh (P-Celtic) and the closely related Irish and Scottish Gaelic (Q-Celtic). The success of this approach rests mainly on underlying multilingual transformers such as mT5. Finally, we show that machine-translated test corpora unfairly inflate AMR evaluation scores by about 1 to 2 points, depending on the language.

Details

Paper ID
lrec2022-ws-cltw-01
Pages
pp. 1-6
BibKey
heinecke-shimorina-2022-multilingual
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 4th Celtic Language Technology Workshop within LREC2022
Date
20 June 2022 to 25 June 2022

Authors

  • Johannes Heinecke

  • Anastasia Shimorina
