Comparative Study of Machine Learning and Transformer-Based Approaches for Arabic Politeness Detection at AdabEval 2026
The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks
Abstract
This paper describes our system submitted to the OSACT7 AdabEval shared task on Arabic politeness detection (Task A). The task requires classifying Arabic texts into three categories: Polite, Impolite, and Neutral. We systematically explore multiple approaches, progressing from classical machine learning baselines built on pre-trained embeddings to fine-tuned transformer models. Our best system uses MARBERT, a transformer model pre-trained on one billion Arabic tweets, fine-tuned with Focal Loss to handle the dataset's pronounced class imbalance (70% of examples are Neutral). We additionally experiment with hybrid approaches that combine fine-tuned embeddings with gradient-boosted classifiers, and with ensemble methods. Our best single model achieves a macro F1 score of 0.84 and an accuracy of 0.90 on the validation set, substantially outperforming classical ML baselines (F1 = 0.42).
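To illustrate why Focal Loss helps when one class dominates, the following is a minimal sketch of the standard formulation, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), written in plain Python for a single example. It is not the submitted training code: the function names, the label order (Polite, Impolite, Neutral), the example logits, and gamma = 2.0 are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one example.

    logits: raw scores, one per class
    target: index of the true class
    gamma:  focusing parameter (assumed value; gamma = 0 gives cross-entropy)
    alpha:  optional per-class weights (hypothetical; None means no weighting)
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, recovered as the gamma = 0 special case."""
    return focal_loss(logits, target, gamma=0.0)


# Assumed label order for this sketch: 0 = Polite, 1 = Impolite, 2 = Neutral.
easy_neutral = ([0.0, 0.0, 5.0], 2)    # confidently correct majority-class example
hard_impolite = ([0.0, 0.0, 2.0], 1)   # minority-class example the model gets wrong

for name, (logits, target) in [("easy Neutral", easy_neutral),
                               ("hard Impolite", hard_impolite)]:
    ce = cross_entropy(logits, target)
    fl = focal_loss(logits, target)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}, ratio={fl / ce:.4f}")
```

The ratio of focal loss to cross-entropy equals (1 - p_t)^gamma, so well-classified examples, which in an imbalanced dataset are mostly majority-class ones, contribute far less to the total loss than misclassified minority-class examples.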