
Multiway Parallel Corpus in Forced Migration Domain for Multilingual Machine Translation

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3gxsf4vr3pjb

Abstract

High-quality domain-specific parallel corpora play a significant role in improving the performance of machine translation (MT) and multilingual natural language processing (NLP) systems in a target domain. However, most existing multilingual parallel corpora focus on general-purpose data, and most highly specialized domains, such as forced migration, suffer from a lack of multilingual data. In this work, we present a new high-quality 4-way parallel corpus in the forced migration domain. The corpus consists of human-translated journal articles from Forced Migration Review in English, French, Spanish, and Arabic. It contains data aligned at both the document and sentence levels in all four languages and provides a clean and reliable 4-way parallel resource for multilingual research in forced migration. Using this dataset, we benchmark several open-weight large language models (LLMs), an open-weight multilingual MT system, online closed MT systems, and a closed LLM across 12 translation directions. We further leverage our corpus to improve the MT quality of a top-performing multilingual foundation model with two common domain adaptation approaches: fine-tuning and few-shot prompting. Our results demonstrate the effectiveness of our corpus in improving the translation performance of current models in the forced migration domain.
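
As a small illustration of the count given in the abstract, the 12 translation directions follow from pairing each of the four corpus languages, as source, with each of the other three as target (4 × 3 = 12). The sketch below enumerates them. The ISO 639-1 codes and the variable names are assumptions chosen for illustration, not identifiers taken from the corpus release.

```python
from itertools import permutations

# Assumed ISO 639-1 codes for the four corpus languages
# (English, French, Spanish, Arabic); the corpus may label them differently.
LANGUAGES = ["en", "fr", "es", "ar"]

# Every ordered (source, target) pair with source != target is one direction.
directions = [f"{src}-{tgt}" for src, tgt in permutations(LANGUAGES, 2)]

print(len(directions))  # 4 sources x 3 targets = 12
print(directions)
```

Because the directions are ordered pairs, English-to-French and French-to-English are counted separately, which is why four languages yield 12 directions rather than 6 language pairs.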

Details

Paper ID
lrec2026-main-384
Pages
pp. 4889-4901
BibKey
azadi-etal-2026-multiway
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Fatemeh Azadi

  • Samuel Larkin

  • Chi-kiu Lo

Links