Ketaba-OCR at AR-MS NakbaNLP 2026: Efficient Adaptation of Vision-Language Models for Handwritten Recognition

Proceedings of the 2nd International Workshop on Nakba Narratives as Language Resources @ LREC 2026

DOI:10.63317/3inc2znes52o

Abstract

This paper presents Ketaba-OCR-LoRA, a system developed for the NakbaNLP 2026 Shared Task on Arabic Manuscript Understanding (Subtask 2), which targets the transcription of the historically significant Omar Al-Saleh Memoir Collection written in Ruq’ah and Naskh scripts. We propose a parameter-efficient adaptation of a publicly available pretrained Arabic-English Handwritten Text Recognition (HTR) model, originally trained on handwritten corpora including the Muharaf dataset. Instead of adapting general Vision-Language Models from scratch, we fine-tune the HTR backbone using Low-Rank Adaptation (LoRA) and 4-bit quantization (QLoRA), reducing memory requirements from 40 GB to approximately 8 GB. Our final submission combines multiple model variants through a novel Linear+Boost weighted ensemble strategy. Under per-line evaluation on the blind test set, our approach achieves a CER of 0.0819 and a WER of 0.2588, ranking 1st; on the official corpus-wide leaderboard, we rank 3rd (CER 0.0938, WER 0.2996). This work demonstrates that specialized pretrained HTR models substantially outperform general-purpose Vision-Language Models for Arabic manuscript transcription, and that parameter-efficient fine-tuning provides a practical and reproducible approach for low-resource cultural heritage digitization.
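
The abstract reports two sets of scores, one per-line and one corpus-wide. The sketch below illustrates one common way such scores can differ. It assumes that the per-line score is the mean of per-line error rates and that the corpus-wide score is total edits divided by total reference length; the shared task's exact definitions are not given here, so both aggregations are assumptions. The function names and the toy strings are hypothetical and are not taken from the paper or the dataset.

```python
from typing import Callable, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def per_line_rate(refs, hyps, tokenize: Callable) -> float:
    """Mean of per-line error rates: every line counts equally (assumed definition)."""
    rates = []
    for ref, hyp in zip(refs, hyps):
        r, h = tokenize(ref), tokenize(hyp)
        rates.append(edit_distance(r, h) / max(len(r), 1))
    return sum(rates) / len(rates)


def corpus_rate(refs, hyps, tokenize: Callable) -> float:
    """Total edits over total reference length: long lines weigh more (assumed definition)."""
    edits = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return edits / max(total, 1)


chars = list       # CER: tokens are characters
words = str.split  # WER: tokens are whitespace-separated words

# Toy data invented for illustration only.
refs = ["abcd", "abcdefghij"]
hyps = ["abxd", "abcdefghij"]

line_cer = per_line_rate(refs, hyps, chars)
corpus_cer = corpus_rate(refs, hyps, chars)
print(f"per-line CER: {line_cer:.4f}, corpus-wide CER: {corpus_cer:.4f}")
```

In this toy case the single error falls on the short line, so the per-line mean is higher than the corpus-wide rate. With errors concentrated on long lines the ordering reverses, which is why the two evaluations can rank systems differently.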

Details

Paper ID
lrec2026-ws-nakbanlp-21
Pages
pp. 160-170
BibKey
barmandah-etal-2026-ketaba
Editors
Mustafa Jarrar, Mo El-Haj, Amal Haddad, Serin Atiani, Shadi Abudalfa, Terry Regier, Paul Rayson, Khalil Sima’an, Camille Mansour
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 2nd International Workshop on Nakba Narratives as Language Resources @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Hassan Barmandah
  • Fatimah Emad Eldin
  • Khloud Al Jallad
  • Omer Nacar

Links