Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
This paper presents the Arabic Women and Society Corpus, a ten-year collection of 252,487 public Arabic Facebook posts related to women’s empowerment and social wellbeing. The posts were collected from 51,660 pages across 77 countries between 2014 and 2024 and together received more than 267 million user interactions. Each post includes engagement metrics such as shares, comments, and emotional reactions, offering a view of audience sentiment and social attention. The data were processed with an automated pipeline comprising language identification, normalization, and metadata cleaning to support reliability and reproducibility. The corpus enables large-scale analysis of gender discourse, social reform, and emotional engagement across Arabic dialects, and it supports research in Arabic natural language processing, computational social science, and digital communication studies. The dataset and accompanying documentation will be released publicly for research use under an open license.
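To make the preprocessing steps named above more concrete, the following is a minimal sketch of what Arabic normalization and a crude script-based language check could look like. It is not the authors' pipeline: the function names (`normalize_arabic`, `arabic_ratio`, `is_probably_arabic`), the specific normalization rules, and the 0.5 threshold are all assumptions chosen for illustration, and a production system would more likely rely on a trained language identifier.

```python
import re
import unicodedata

# All rules below are illustrative assumptions; the paper does not
# specify which normalization steps its pipeline applies.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")    # tashkeel marks and dagger alef
_TATWEEL = "\u0640"                                   # elongation character
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda / hamza forms
_ARABIC_LETTER = re.compile(r"[\u0621-\u064A]")       # core Arabic letter range


def normalize_arabic(text: str) -> str:
    """Apply a small set of common Arabic orthographic normalizations."""
    text = unicodedata.normalize("NFKC", text)  # fold presentation forms
    text = _DIACRITICS.sub("", text)            # drop short-vowel marks
    text = text.replace(_TATWEEL, "")           # drop elongation
    text = _ALEF_VARIANTS.sub("\u0627", text)   # unify alef variants to bare alef
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def arabic_ratio(text: str) -> float:
    """Return the share of alphabetic characters that are Arabic letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for ch in letters if _ARABIC_LETTER.match(ch))
    return arabic / len(letters)


def is_probably_arabic(text: str, threshold: float = 0.5) -> bool:
    """Crude script-based filter; the threshold is a hypothetical choice."""
    return arabic_ratio(text) >= threshold
```

A script-ratio check of this kind only separates Arabic-script text from other scripts; it cannot distinguish Arabic from other languages written in the same script, nor one dialect from another.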