SentiArabic: A Sentiment Analyzer for Standard Arabic
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
Sentiment analysis has been receiving increasing interest as it conveys valuable information about people’s preferences and opinions. In this work, we present a sentiment analyzer that identifies the overall contextual polarity of Standard Arabic text. The contribution of this work is threefold. First, we modify and extend SLSA, a large-scale Sentiment Lexicon for Standard Arabic. Second, we build a sentiment corpus of Standard Arabic text tagged for its contextual polarity. This corpus provides the training, development and test sets for the proposed system. Third, we build a lightweight lexicon-based sentiment analyzer for Standard Arabic (SentiArabic). The analyzer does not require heavy computation: the link to the lexicon is carried out through a morphological lookup as opposed to a rich morphological analysis, and the assignment of the sentiment is based on a simple decision tree that uses polarity scores as opposed to a more complex machine learning approach that relies on lexical information. Negation receives special handling. The analyzer is highly effective: it achieves an F-score of 76.5% when evaluated on a blind test set, which is the highest result reported for that set and an absolute 3.0% increase over a state-of-the-art system that uses deep-learning models.
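To make the pipeline described above concrete, the following is a minimal sketch of a lexicon-based analyzer with a lookup step, negation handling, and a small decision rule over polarity scores. It is an illustration under stated assumptions, not the SentiArabic implementation: the lexicon entries and their scores, the surface-form table, the negator list, the rule that a negator swaps the scores of the next sentiment-bearing word, the three output labels, and the `margin` threshold are all hypothetical, and the hand-written rules stand in for a decision tree that would be learned from the training set.

```python
from typing import Dict, List, Tuple

# Toy lexicon: lemma -> (positive score, negative score).
# Values are invented for illustration and are NOT taken from SLSA.
LEXICON: Dict[str, Tuple[float, float]] = {
    "جيد": (0.8, 0.0),    # "good"
    "ممتاز": (0.9, 0.0),  # "excellent"
    "سيء": (0.0, 0.8),    # "bad"
}

# Toy surface-form -> lemma table standing in for the morphological lookup
# (hypothetical entries; a real table would be far larger).
LEMMA_LOOKUP: Dict[str, str] = {
    "الجيد": "جيد",
    "جيدا": "جيد",
    "السيء": "سيء",
}

# A few common negation particles (assumed list, not the paper's).
NEGATORS = {"لا", "ليس", "لم", "لن"}


def score_tokens(tokens: List[str]) -> Tuple[float, float]:
    """Sum positive and negative scores over tokens found in the lexicon.

    Assumption: a negator swaps the positive and negative scores of the
    next sentiment-bearing word, then its effect ends.
    """
    pos_total, neg_total = 0.0, 0.0
    negate_next = False
    for token in tokens:
        if token in NEGATORS:
            negate_next = True
            continue
        # Plain table lookup; unknown forms fall back to the token itself.
        lemma = LEMMA_LOOKUP.get(token, token)
        if lemma in LEXICON:
            pos, neg = LEXICON[lemma]
            if negate_next:
                pos, neg = neg, pos
                negate_next = False
            pos_total += pos
            neg_total += neg
    return pos_total, neg_total


def classify(text: str, margin: float = 0.1) -> str:
    """Hand-written decision rules over the aggregated polarity scores."""
    pos, neg = score_tokens(text.split())
    if pos - neg > margin:
        return "positive"
    if neg - pos > margin:
        return "negative"
    return "neutral"
```

In this sketch, `classify("الفيلم ليس جيدا")` ("the film is not good") resolves `جيدا` to the lemma `جيد` through the table, swaps its scores because of the preceding negator, and falls into the negative branch. The point of the design is that each step is a dictionary lookup or a comparison, so no morphological analyzer or trained neural model is needed at prediction time.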