SAINT: Multilingual Span-Level Interpretability for Sentiment Analysis
Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"
Abstract
We investigate multilingual sentiment analysis and interpretability across high- and low-resource languages, focusing on Amharic, English, German, and Hausa. Our study evaluates encoder-only transformer models for both sequence-level sentiment classification and token-level attribution using Captum. Additionally, we assess zero- and few-shot decoder-only models for sequence-level sentiment prediction. Our results show that few-shot decoder-only models outperform encoder-only models on token-level sentiment classification in most languages, with the exception of Hausa, where a multilingual encoder-based model leads. For sequence-level sentiment classification, encoder-only models achieve strong performance across most languages, but decoder-only models are highly competitive and may even surpass encoders in both high-resource (German, English) and low-resource settings, depending on the prompting strategy. These findings highlight the utility of combining fine-tuned transformer models with prompt-based large language models to build interpretable sentiment analysis systems across both low- and high-resource languages. The SAINT dataset, annotation guideline, and evaluation scripts can be found at https://github.com/uhh-hcds/SAINT.