LREC 2026 workshop

On LLM Prompting Techniques for Arabic Language Arithmetic Reasoning

The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks

DOI:10.63317/52s6fyvjghcs

Abstract

Math word problems (MWPs) require complex reasoning to extract mathematical relationships from textual descriptions. While Large Language Models (LLMs) have shown remarkable performance on English mathematical reasoning tasks, their effectiveness on Arabic MWPs remains largely unexplored. This paper introduces three Arabic datasets (AGSM8K, Qudurat, and ArabicMWPs) and evaluates six LLMs using three prompting techniques: Manual Chain-of-Thought (CoT), Zero-shot CoT, and Self-consistency. Performance is assessed using accuracy and BERTScore (precision, recall, and F1-score). Our findings show that GPT-4o with Self-consistency achieves the highest accuracy, 97.65% on AGSM8K. It also obtains a precision of 71.94%, a recall of 71.31%, and an F1-score of 71.50%. The Arabic-specific LLM ALLaM achieves 84.41% accuracy on ArabicMWPs and 43.97% on AGSM8K. We further fine-tune models on Arabic mathematical data. This work addresses a critical gap in Arabic mathematical reasoning resources and provides insights for developing Arabic-capable AI systems. Prompt engineering combined with LLMs is a strong approach to solving Arabic mathematical problems in support of education and scientific research.
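As a rough illustration of the Self-consistency technique and the accuracy metric named in the abstract, the sketch below takes several sampled reasoning paths for one problem, extracts the last number from each, and returns the majority answer. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the "last number in the completion" extraction rule, and the digit normalization are all hypothetical choices made for this example, and the model sampling step is left out.

```python
import re
from collections import Counter

# Assumption: answers may use Arabic-Indic digits and the Arabic decimal
# separator, so both are mapped to ASCII before parsing.
_ARABIC_TO_ASCII = str.maketrans("٠١٢٣٤٥٦٧٨٩٫", "0123456789.")


def extract_final_number(completion):
    """Return the last number found in a completion, or None if there is none.

    Hypothetical extraction rule: the final answer is assumed to be the
    last numeric token in the model's reasoning text.
    """
    text = completion.translate(_ARABIC_TO_ASCII).replace(",", "")
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return float(matches[-1]) if matches else None


def self_consistent_answer(completions):
    """Majority vote over the final answers of several sampled reasoning paths."""
    answers = [a for a in map(extract_final_number, completions) if a is not None]
    if not answers:
        return None
    # most_common(1) returns the most frequent answer; ties go to the
    # answer that appeared first.
    return Counter(answers).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of problems whose predicted answer matches the gold answer."""
    correct = sum(
        p is not None and abs(p - g) < 1e-6 for p, g in zip(predictions, gold)
    )
    return correct / len(gold)
```

In use, `self_consistent_answer` would be called once per problem on the sampled completions, and the resulting list of predictions passed to `accuracy` together with the gold answers.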

Details

Paper ID
lrec2026-ws-osact-13
Pages
pp. 106-114
BibKey
alenezi-etal-2026-llm
Editors
Hend Al-Khalifa, Mo El-Haj, Saad Ezzini
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Reem Alenezi

  • Ayed Atallah Salman
