
CasbAI at AraSentEval 2026: Robust Dialectal Arabic Sentiment Classification via Multi-Seed Ensembling and Data Augmentation.

The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks

DOI:10.63317/32cq3gywnhjs

Abstract

This paper describes the system we designed for our participation in the AraSentEval 2026 shared task on Arabic dialectal sentiment analysis. We propose a transformer-based approach relying on MARBERT combined with a multi-seed ensemble strategy and several optimization techniques. Our system integrates seven independently trained models with different random initializations and applies Stochastic Weight Averaging (SWA) to improve generalization. To address class imbalance, we augment the training data through dialectal synonym replacement, increasing the dataset size by 13.9% while preserving dialect distribution. In addition, we incorporate Test-Time Augmentation (TTA) and investigate the use of pseudo-labeling based on high-confidence predictions. We report our experiments on the official dataset covering Moroccan, Egyptian, Jordanian, and Saudi dialects, and analyze the contribution of each component through ablation experiments. Our system achieved a macro F1-score of 84.62% on the test set, ranking 3rd among 15 participating teams.
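As a rough illustration of two ideas named in the abstract, the sketch below averages class probabilities over several independently seeded models (soft voting) and keeps only high-confidence predictions as pseudo-labels. It is not the authors' implementation: the function names, the 0.9 confidence threshold, and the use of plain probability averaging are assumptions made for the example, and the abstract does not state how the seven models were actually combined. A small macro F1 helper is included because that is the metric the abstract reports.

```python
import numpy as np


def ensemble_probs(per_model_probs):
    """Soft voting: average class probabilities over models.

    per_model_probs is a list of arrays, one per seed, each of shape
    (n_samples, n_classes). Averaging is an assumption, not a detail
    taken from the abstract.
    """
    stacked = np.stack(per_model_probs, axis=0)  # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0)


def select_pseudo_labels(probs, threshold=0.9):
    """Keep samples whose top class probability reaches the threshold.

    The default threshold of 0.9 is a hypothetical value.
    Returns (indices of kept samples, their predicted labels).
    """
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```

In this sketch, the unlabeled samples returned by `select_pseudo_labels` would be added to the training set with their predicted labels before retraining. Because macro F1 weights every class equally, it is sensitive to performance on minority classes, which is one reason class imbalance matters for this metric.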

Details

Paper ID: lrec2026-ws-osact-38
Pages: pp. 278-283
BibKey: abdelaziz-etal-2026-casbai
Editors: Hend Al-Khalifa, Mo El-Haj, Saad Ezzini
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Chaima Abdelaziz
  • KahinaHouda Saadaoui
  • Faiza Belbachir
  • Lynda Said Lhadj

Links