LLM-Based Financial Sentiment Analysis in Arabic: Evidence from Saudi Markets

The 7th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT7) with 5 Shared Tasks

Abstract

Investor sentiment significantly influences financial markets, yet Arabic financial sentiment analysis remains limited by linguistic complexity and scarce domain-specific resources. This paper presents an LLM-based framework for large-scale Arabic financial sentiment analysis tailored to the Saudi market. We construct an 84K-sample Arabic Financial Sentiment Corpus integrating official financial news and social media data. The proposed pipeline comprises preprocessing, deduplication, entity linking, conditional summarization, and five-class sentiment labeling, with a multi-model consensus strategy used to improve label reliability. We benchmark multiple large language models against traditional lexicon-based and fine-tuned transformer baselines. GPT-5 achieves the strongest class-balanced performance (Macro-F1 = 0.829), substantially outperforming both the lexicon-based and the fine-tuned transformer baselines. For summarization, Allam offers the best trade-off between quality, hallucination control, and cost efficiency. Additional analyses examine cost–quality trade-offs and the impact of summarization on sentiment consistency. The results establish new benchmarks for Arabic financial sentiment classification and demonstrate the effectiveness of scalable LLM-based pipelines for domain-specific Arabic NLP.
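
To make two terms from the abstract concrete, the following is a minimal illustrative sketch rather than the paper's implementation. It assumes that multi-model consensus can be approximated by a strict majority vote over per-model labels, and it computes Macro-F1 under its standard definition as the unweighted mean of per-class F1 scores. The label names, the `min_agree` threshold, and both function names are hypothetical; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical five-class label set; the actual label names are an assumption.
LABELS = ["very_negative", "negative", "neutral", "positive", "very_positive"]


def consensus_label(votes, min_agree=2):
    """Return the majority label among model votes.

    Returns None when no label reaches `min_agree` votes or when the
    top count is tied, so that such samples can be routed elsewhere.
    This voting rule is an assumption, not the paper's stated strategy.
    """
    counts = Counter(votes).most_common()
    top_label, top_count = counts[0]
    tied = len(counts) > 1 and counts[1][1] == top_count
    if top_count < min_agree or tied:
        return None
    return top_label


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so every class counts equally."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Convention chosen here: a class with no true or predicted
        # samples contributes an F1 of 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because each class contributes equally to the average regardless of its frequency, a model that performs well only on the majority class scores poorly on Macro-F1, which is why the metric is described as class-balanced.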