Annotation Matters: Resolving Cross-Corpus Performance Drops in Hebrew Offensive Language Detection

Proceedings of Computational Affective Science (CAS) @ LREC 2026

DOI:10.63317/3fss6oc5mono

Abstract

Cross-dataset generalization remains a major challenge in offensive language detection, especially for culturally sensitive languages such as Hebrew. A large Hebrew dataset introduced in prior work (citation omitted for double-blind review) was annotated via a taxonomy-grounded, prompt-guided LLM protocol and achieved strong in-domain results. However, performance degraded sharply on two external Hebrew corpora. We investigate whether this degradation reflects domain shift or annotation shift, i.e., differences in how offensiveness is operationalized across datasets. Using the same prompt framework and a dual-LLM agreement procedure, we re-annotate both external corpora and quantify label divergence. We observe substantial mismatch between the original and new annotations, consistent with the view that offensiveness is not objective but depends on cultural context, discourse conventions, political framing, and the interpretation of irony. Evaluating models against the new labels yields markedly improved performance, and fine-tuning with the new external labels further improves results. Overall, our findings suggest that cross-dataset failure in affective NLP tasks may often be driven by annotation mismatch rather than domain adaptation limitations, highlighting the importance of annotation validity and culturally grounded labeling protocols.
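The re-annotation comparison described above can be illustrated with a minimal sketch. It assumes binary labels (1 = offensive, 0 = not offensive), keeps a new label only where two LLM annotators agree, and then measures divergence from the original labels with a raw disagreement rate and Cohen's kappa. The abstract does not state which divergence measure the authors used, so kappa is an illustrative choice here; the function names and toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def dual_agreement(labels_a, labels_b):
    """Keep a label only where both annotators agree; None marks a disagreement."""
    if len(labels_a) != len(labels_b):
        raise ValueError("annotator label lists must have the same length")
    return [a if a == b else None for a, b in zip(labels_a, labels_b)]


def disagreement_rate(original, new):
    """Fraction of items whose original and new labels differ."""
    if len(original) != len(new) or not original:
        raise ValueError("label lists must be non-empty and equally long")
    return sum(o != n for o, n in zip(original, new)) / len(original)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each sequence's label frequencies.
    expected = sum(count_a[l] * count_b[l] for l in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical): original corpus labels and two LLM annotators.
original = [1, 0, 1, 1, 0, 0, 1, 0]
llm_a = [1, 0, 0, 1, 0, 1, 0, 0]
llm_b = [1, 0, 0, 1, 1, 1, 0, 0]

agreed = dual_agreement(llm_a, llm_b)

# Compare only the items on which both LLM annotators agreed.
kept = [(o, n) for o, n in zip(original, agreed) if n is not None]
orig_kept = [o for o, _ in kept]
new_kept = [n for _, n in kept]

rate = disagreement_rate(orig_kept, new_kept)
kappa = cohen_kappa(orig_kept, new_kept)
print(f"items kept: {len(kept)}, disagreement rate: {rate:.3f}, kappa: {kappa:.3f}")
```

A low kappa between original and new labels on the same texts points to annotation shift rather than domain shift, since the inputs are identical and only the labeling protocol differs.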

Details

Paper ID: lrec2026-ws-cas-10
Pages: pp. 116-124
BibKey: bergerhefetz-etal-2026-annotation
Editors: Christopher Bagdon, Krishnapriya Vishnubhotla, Kristen A. Lindquist, Lyle Ungar, Roman Klinger, Saif M. Mohammad
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of Computational Affective Science (CAS) @ LREC 2026
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Gili Berger Hefetz
  • Yossef Haim Shrem
  • Natalia Vanetik
  • Chaya Liebeskind

Links