Back to OSACT 2024
LREC-COLING 2024workshop

AraT5-MSAizer: Translating Dialectal Arabic to MSA

Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024

DOI:10.63317/26nxp6cqn8q2

Abstract

This paper outlines the process of training the AraT5-MSAizer model, a transformer-based neural machine translation model aimed at translating five regional Arabic dialects into Modern Standard Arabic (MSA). Developed for Task 2 of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools, the model attained a BLEU score of 21.79% on the test set associated with this task.

Details

Paper ID
lrec2024-ws-osact-16
Pages
pp. 124-129
BibKey
fares-2024-arat5
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024
Location
undefined, undefined
Date
20 May 2024 25 May 2024

Authors

  • MF

    Murhaf Fares

Links