
FlorUniTo@TRAC-2: Retrofitting Word Embeddings on an Abusive Lexicon for Aggressive Language Detection

Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying

DOI:10.63317/2u7oh6j5i9kw

Abstract

This paper describes our participation in the TRAC-2 Shared Tasks on Aggression Identification. Our team, FlorUniTo, investigated whether an abusive lexicon can be used to enhance word embeddings and thereby improve the detection of aggressive language. The embeddings used in our paper are word-aligned pre-trained vectors for English, Hindi, and Bengali, to reflect the languages in the shared task data sets. The embeddings are retrofitted to a multilingual abusive lexicon, HurtLex. We experimented with an LSTM model using the original as well as the transformed embeddings, across different language and setting variations. Overall, our systems placed toward the middle of the official rankings based on weighted F1 score. However, the results on the development and test sets show promising improvements across languages, especially on the misogynistic aggression sub-task.
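To make the retrofitting step concrete, the sketch below shows one common formulation of it: each word that appears in a lexicon is moved toward the average of its lexicon neighbours while staying anchored to its original vector. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not say how HurtLex entries are linked to one another, so the `lexicon` mapping from a word to its related words, the function name `retrofit`, the equal weighting, the iteration count, and the toy tokens are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def retrofit(embeddings: Dict[str, Vector],
             lexicon: Dict[str, List[str]],
             iterations: int = 10) -> Dict[str, Vector]:
    """Pull the vectors of lexicon-linked words toward each other.

    `lexicon` maps a word to the words it is related to (an assumed format).
    Words without usable lexicon neighbours keep their original vector.
    """
    # Work on a copy so the original vectors remain available as anchors.
    new = {word: list(vec) for word, vec in embeddings.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            if word not in embeddings:
                continue
            linked = [n for n in neighbours if n in new and n != word]
            if not linked:
                continue
            k = len(linked)
            dim = len(embeddings[word])
            # Original vector gets weight k, each neighbour gets weight 1,
            # so both sides contribute equally to the new position.
            total = [k * embeddings[word][d] + sum(new[n][d] for n in linked)
                     for d in range(dim)]
            new[word] = [x / (2 * k) for x in total]
    return new


# Toy data: two placeholder lexicon entries and one unrelated word.
embeddings = {
    "insult_a": [1.0, 0.0],
    "insult_b": [0.0, 1.0],
    "table": [5.0, 5.0],
}
lexicon = {"insult_a": ["insult_b"], "insult_b": ["insult_a"]}
retrofitted = retrofit(embeddings, lexicon)
```

In this toy setup the two linked entries end up closer together than they started, while `table`, which has no lexicon entry, is left unchanged. A downstream classifier such as the LSTM mentioned above would then be fed either the original or the retrofitted vectors.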

Details

Paper ID
lrec2020-ws-trac-17
Pages
pp. 106-112
BibKey
koufakou-etal-2020-florunito
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying
Date
11 May 2020 to 16 May 2020

Authors

  • Anna Koufakou
  • Valerio Basile
  • Viviana Patti
