LREC 2026 main

Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2ry23e89ew5v

Abstract

Current Multimodal Large Language Models (MLLMs) exhibit very strong performance on several demanding tasks. While commercial MLLMs deliver acceptable performance in low-resource languages, the open science community has yet to achieve comparable results. In this paper, we aim to develop a strong MLLM for a low-resource language, namely Basque. To that end, we build our own training and evaluation image-text datasets, leveraging state-of-the-art translation systems. Using two different Large Language Models as backbones, the Llama-3.1-Instruct model and a Basque-adapted variant called Latxa, we explore several data mixtures for training, covering both Basque and English in multimodal and text-only data. Evaluating our MLLMs on close-ended and open-ended generation tasks, we show that: i) low ratios of Basque multimodal data (around 20%) are already enough to obtain solid results on Basque benchmarks, and ii) contrary to expectations, a Basque-instructed backbone LLM is not required to obtain a strong MLLM in Basque. Additionally, we identify the optimal data mixture strategy, examine the effects of multimodal data on text-only tasks, and analyze evaluation approaches for open-ended generation tasks. By openly releasing our resources, we pave the way for developing MLLMs for other low-resource languages.

Details

Paper ID
lrec2026-main-721
Pages
pp. 9172-9187
BibKey
arana-etal-2026-multimodal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Lukas Arana
  • Julen Etxaniz
  • Ander Salaberria
  • Gorka Azkune

Links