Utilizing Large Twitter Corpora to Create Sentiment Lexica

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/33kcb4ixr4hd

Abstract

The paper describes an automatic Twitter sentiment lexicon creator and a lexicon-based sentiment analysis system. The lexicon creator is based on a Pointwise Mutual Information approach and uses 6.25 million automatically labeled tweets and 103 million unlabeled tweets; the resulting lexicon consists of about 3,000 entries. In a comparison experiment, this lexicon outperformed a manually annotated lexicon. A sentiment analysis system that uses the created lexicon and handles both negation and intensification produces results almost on par with sophisticated machine learning-based systems, while significantly outperforming them in terms of run-time.
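
As a rough illustration of the Pointwise Mutual Information idea mentioned in the abstract, the sketch below scores each token as PMI(token, positive) minus PMI(token, negative) over a set of labeled tweets. It is not the authors' implementation: the function name `pmi_lexicon`, the whitespace tokenization, the `"pos"`/`"neg"` labels, the add-one smoothing, and the `min_count` cutoff are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_lexicon(labeled_tweets, min_count=1, smoothing=1.0):
    """Build a token -> sentiment score map from (text, label) pairs.

    Hypothetical helper: labels are assumed to be "pos" or "neg", and
    tokens are obtained by naive lowercasing and whitespace splitting.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in labeled_tweets:
        tokens = text.lower().split()
        if label == "pos":
            pos_counts.update(tokens)
        else:
            neg_counts.update(tokens)

    n_pos = sum(pos_counts.values())
    n_neg = sum(neg_counts.values())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one token in each class")

    lexicon = {}
    for token in set(pos_counts) | set(neg_counts):
        f_pos, f_neg = pos_counts[token], neg_counts[token]
        if f_pos + f_neg < min_count:
            continue  # skip tokens that are too rare to score reliably
        # PMI(w, pos) - PMI(w, neg) simplifies to
        # log2( f(w, pos) * N_neg / (f(w, neg) * N_pos) );
        # the smoothing term (an assumption) avoids log(0) and division by zero.
        ratio = ((f_pos + smoothing) * n_neg) / ((f_neg + smoothing) * n_pos)
        lexicon[token] = math.log2(ratio)
    return lexicon
```

With this scoring, tokens seen mostly in positive tweets get a score above zero, tokens seen mostly in negative tweets get a score below zero, and tokens spread evenly across both classes land near zero. A lexicon-based classifier could then sum these scores over a tweet, with negation and intensification handled as separate adjustments.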

Details

Paper ID
lrec2018-main-447
Pages
N/A
BibKey
fredriksen-etal-2018-utilizing
Editors
Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 - 12 May 2018

Authors

  • Valerij Fredriksen
  • Brage Jahren
  • Björn Gambäck