LREC-COLING 2024 workshop

Resources for Annotating Hate Speech in Social Media Platforms Used in Ethiopia: A Novel Lexicon and Labelling Scheme

Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024

DOI:10.63317/5o9k42gosm9f

Abstract

Hate speech on social media has proliferated in Ethiopia. To support studies investigating the targets and types of hate speech circulating in the Ethiopian context, we developed a new fine-grained annotation scheme that captures three elements of hate speech: the target (i.e., any groups with protected characteristics), type (i.e., the method of abuse) and nature (i.e., the style of the language used). We also developed a new lexicon of hate speech-related keywords in the four most prominent languages found on Ethiopian social media: Amharic, Afaan Oromo, English and Tigrigna. These keywords enabled us to retrieve social media posts in the same four languages from three platforms (X, Telegram and Facebook) that are likely to contain hate speech. Experts in the Ethiopian context then manually annotated a sample of the retrieved posts, obtaining fair to moderate inter-annotator agreement. The resulting annotations formed the basis of a case study of which groups tend to be targeted by particular types of hate speech or by particular styles of hate speech language.
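
For readers unfamiliar with the phrase "fair to moderate" agreement, the following is a minimal sketch of how such a figure could be computed and interpreted. It assumes Cohen's kappa for two annotators and the Landis and Koch (1977) interpretation bands; the abstract does not state which agreement coefficient the authors used, and the function names and example labels below are hypothetical, not taken from the paper's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch_band(kappa):
    """Map a kappa value to the Landis and Koch (1977) verbal label."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical "target" labels from two annotators (illustrative only).
annotator_1 = ["ethnicity", "religion", "none", "ethnicity",
               "none", "religion", "none", "ethnicity"]
annotator_2 = ["ethnicity", "none", "none", "religion",
               "none", "religion", "ethnicity", "ethnicity"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f} ({landis_koch_band(kappa)})")
```

In a three-part scheme like the one described, agreement would typically be computed separately for each element (target, type and nature), since annotators may agree more readily on one than on another.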

Details

Paper ID: lrec2024-ws-rail-13
Pages: pp. 115-123
BibKey: ibrahim-etal-2024-resources
Editor: N/A
Publisher: European Language Resources Association (ELRA) and ICCL
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024
Date: 20-25 May 2024

Authors

  • Nuhu Ibrahim
  • Felicity Mulford
  • Matt Lawrence
  • Riza Batista-Navarro
