HCMUS_TheFangs at NakbaVirality Shared Task: The Audience is the Message: Escaping the Deep Learning Trap in Conflict-Domain Virality Prediction
Proceedings of the 2nd International Workshop on Nakba Narratives as Language Resources @ LREC 2026
Abstract
We present HCMUS_TheFangs’s system for the Nakba-NLP 2026 Virality Shared Task, which ranked first on the final leaderboard with a test Macro-F1 of 0.7062. Our winning system is deliberately simple: a single Community Target Encoding feature (the smoothed historical virality rate of the posting subreddit), combined with TF-IDF text features and an XGBoost classifier. This design emerged from a hard-won insight: virality in conflict reporting is determined not by what is posted but by where it is posted. Most of this paper is devoted to showing why this holds. Through 18 ablation experiments, we trace the journey from deep learning failure (a cross-attention fusion model scoring 0.2935) to sociological feature engineering. We demonstrate that in conflict domains, deep learning overfits, promotional hashtags correlate negatively with engagement, and visual features are context-dependent modifiers rather than independent signals. Our findings challenge standard practices in multimodal classification and offer a roadmap for predicting virality in highly polarized, community-driven social media environments.
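To make the Community Target Encoding feature concrete, the following is a minimal sketch of one common way to compute a smoothed per-community rate. It is an illustration under stated assumptions, not the system's actual implementation: the additive (m-estimate) smoothing formula, the binary viral/non-viral labels, the smoothing strength `m`, and the names `fit_community_encoding` and `encode_communities` are all hypothetical choices made for this example.

```python
from collections import defaultdict


def fit_community_encoding(communities, labels, m=10.0):
    """Fit a smoothed target encoding from training data.

    Assumption: labels are binary (1 = viral, 0 = not viral), and smoothing
    is additive (m-estimate), which pulls small communities toward the
    global rate. The abstract does not specify the smoothing scheme.
    """
    if len(communities) != len(labels) or not labels:
        raise ValueError("communities and labels must be non-empty and aligned")

    prior = sum(labels) / len(labels)  # global virality rate
    counts = defaultdict(int)          # posts per community
    positives = defaultdict(int)       # viral posts per community
    for community, label in zip(communities, labels):
        counts[community] += 1
        positives[community] += label

    # Smoothed rate: (viral posts + m * prior) / (posts + m)
    encoding = {
        community: (positives[community] + m * prior) / (counts[community] + m)
        for community in counts
    }
    return encoding, prior


def encode_communities(communities, encoding, prior):
    """Map communities to their smoothed rate; unseen ones get the prior."""
    return [encoding.get(community, prior) for community in communities]


# Toy data with hypothetical subreddit names.
train_communities = ["r/a", "r/a", "r/a", "r/b"]
train_labels = [1, 1, 0, 0]

encoding, prior = fit_community_encoding(train_communities, train_labels, m=2.0)
features = encode_communities(["r/a", "r/b", "r/unseen"], encoding, prior)
```

In a full pipeline, the resulting value would be appended as one extra column alongside the TF-IDF features before training the classifier. When the encoding is computed on the same rows used for training, it is usually fitted out-of-fold so that a post's own label does not leak into its feature.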