Not Gemma at AR-MS NakbaNLP 2026: Mubsir OCR: End-to-End Recognition of Arabic Handwritten Text
Historical Arabic handwritten OCR is difficult because of cursive script, fine diacritics, mixed numerals, and degraded media; classical segmentation pipelines compound errors, whereas end-to-end vision-language models (VLMs) can adapt when fine-tuned on in-domain data. We present Mubsir OCR, a systematic evaluation on the NAKBA dataset, which comprises an annotated set (15,962 training line crops and 2,095 validation lines with ground truth, used for all nine experiments) and a separate blind AR-MS (Subtask 2) set (2,671 images; scores available only via official submission). We compare external vs. in-house VLMs (Qwen2.5-VL 3B, Qwen3-VL-4B-Instruct, Gemma3), inference backends (vLLM/bf16 vs. HuggingFace/bf16), training length (16 vs. 32 epochs), and test-time preprocessing (CLAHE+unsharp). The best result on the annotated validation set is 8.59% character error rate (CER) / 25.87% word error rate (WER) with the HuggingFace bf16 backend; the same configuration attains 11.00% CER / 31.26% WER on the blind set. Domain-specific fine-tuning beats general-purpose checkpoints; preprocessing helps only marginally and is not recommended without train-time augmentation.
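The abstract reports CER and WER without stating how they were computed. As a minimal sketch, assuming the conventional definitions (Levenshtein edit distance over characters or whitespace-separated words, divided by the reference length), the metrics could be computed as follows. The function names are hypothetical, and any text normalization applied by the authors or the official scorer (for example diacritic or numeral handling) is not known and is not modeled here.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word edits divided by the number of reference words."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the NAKBA dataset.
    reference = "كتاب جديد"
    hypothesis = "كتب جديد"
    print(f"CER = {cer(reference, hypothesis):.4f}")
    print(f"WER = {wer(reference, hypothesis):.4f}")
```

This sketch scores a single line. Whether the reported figures average per-line rates or pool edits over the whole set is not stated in the abstract, so aggregation is left out.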