Back to Main Conference 2024
LREC-COLING 2024main

JL-Hate: An Annotated Dataset for Joint Learning of Hate Speech and Target Detection

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/2tt6z9mkkfk7

Abstract

The detection of hate speech is a subject extensively explored by researchers, and machine learning algorithms play a crucial role in this domain. The existing resources mostly focus on text sequence classification for the task of hate speech detection. However, the target of hateful content is another dimension that has not been studied in details due to the lack of data resources. In this study, we address this gap by introducing a novel tweet dataset for the task of joint learning of hate speech detection and target detection, called JL-Hate, for the tasks of sequential text classification and token classification, respectively. The JL-Hate dataset consists of 1,530 tweets divided equally in English and Turkish languages. Leveraging this dataset, we conduct a series of benchmark experiments. We utilize a joint learning model to concurrently perform sequence and token classification tasks on our data. Our experimental results demonstrate consistent performance with the prevalent studies, both in sequence and token classification tasks.

Details

Paper ID
lrec2024-main-0834
Pages
pp. 9543-9553
BibKey
buyukdemirci-etal-2024-jl
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • KB

    Kaan Büyükdemirci

  • IK

    Izzet Emre Kucukkaya

  • Eren Ölmez

  • CT

    Cagri Toraman

Links