How Well Do Large Language Models Reason in Under-Resourced Languages? Evidence from Vietnamese

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

DOI: 10.63317/2a43bkurpywk

Abstract

Despite advancements in Large Language Models, reasoning benchmarks remain centered on high-resource languages, leaving languages like Vietnamese under-evaluated. In this study, we address this gap by evaluating four models, PhoGPT (native), Vistral and VBD-Llama (adapted), and Llama-2 (English-centric), on commonsense reasoning and arithmetic reasoning. As Vietnamese benchmarks for these tasks are lacking, we adapt two analogy datasets from English to Vietnamese and construct two sequence datasets, ensuring a range of structural complexity and difficulty levels. We evaluate diverse prompting strategies, including Chain-of-Thought, role-playing guidance, cross-lingual prompting, and few-shot learning. Our results reveal a baseline proficiency in analogical and arithmetic reasoning among the models, with Vistral and Llama-2 outperforming the other models on multiple tasks. The effects of Chain-of-Thought and contextual guidance are limited in Vietnamese, while cross-lingual prompting and few-shot learning show promising performance improvements. The findings underscore the feasibility of adapting benchmarks to less-resourced languages and provide insights into the strengths and weaknesses of Vietnamese LLMs, suggesting directions for model improvement.

Details

Paper ID

lrec2026-ws-sigul-01

Pages

pp. 1-18

DOI

10.63317/2a43bkurpywk

BibKey

do-etal-2026-how

Editors

Atul Kr. Ojha, Sakriani Sakti, Claudia Soria, Maite Melero, John P. McCrae, Constantine Lignos, Chao-Hong Liu, German Rigau Claramunt, Georg Rehm

Publisher

European Language Resources Association (ELRA)

ISSN

N/A

ISBN

N/A

Workshop

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

Location

Palma, Mallorca, Spain

Date

11 - 16 May 2026

Authors

Tuan Anh Do
Jelke Bloem
