LREC-COLING 2024 workshop

Overcoming Early Saturation on Low-Resource Languages in Multilingual Dependency Parsing

Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024

DOI:10.63317/57hfe6yoj2ms

Abstract

UDify is a multilingual and multi-task parser fine-tuned on mBERT that achieves remarkable performance on high-resource languages. However, its performance saturates early and then decreases gradually on low-resource languages as training proceeds. This work applies a data augmentation method and conducts experiments on seven few-shot and four zero-shot languages. Unlabeled attachment scores improved on the dependency parsing tasks for the zero-shot languages, with the average score rising from 67.1% to 68.7%. Meanwhile, dependency parsing for high-resource languages and the other tasks were hardly affected. The experimental results indicate that the data augmentation method is effective for low-resource languages in multilingual dependency parsing.

Details

Paper ID
lrec2024-ws-mwe-10
Pages
pp. 63-69
BibKey
mao-etal-2024-overcoming
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024
Date
20-25 May 2024

Authors

  • Jiannan Mao
  • Chenchen Ding
  • Hour Kaing
  • Hideki Tanaka
  • Masao Utiyama
  • Tadahiro Matsumoto

Links