Generating a Gold Standard for a Swedish Sentiment Lexicon

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

There is an increasing demand for multilingual sentiment analysis, and most work on sentiment lexicons is still carried out based on English lexicons like WordNet. In addition, many of the non-English sentiment lexicons that do exist have been compiled by (machine) translation from English resources, thereby arguably obscuring possible language-specific characteristics of sentiment-loaded vocabulary. In this paper we describe the creation of a gold standard for the sentiment annotation of Swedish terms as a first step towards the creation of a full-fledged sentiment lexicon for Swedish -- i.e., a lexicon containing information about \emph{prior} sentiment (also called polarity) values of lexical items (words or disambiguated word senses), along a scale negative--positive. We create a gold standard for sentiment annotation of Swedish terms, using the freely available SALDO lexicon and the Gigaword corpus. For this purpose, we employ a multi-stage approach combining corpus-based frequency sampling and two stages of human annotation: direct score annotation followed by Best-Worst Scaling. In addition to obtaining a gold standard, we analyze the data from our process and we draw conclusions about the optimal sentiment model.

Resources

Details

Paper ID

lrec2018-main-426

Pages

N/A

DOI

10.63317/439usxg8i7vq

BibKey

rouces-etal-2018-generating

Editors

Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

79-10-95546-00-9

Conference

Eleventh International Conference on Language Resources and Evaluation

Location

Miyazaki, Japan

Date

7 - 12 May 2018

Authors

JR
Jacobo Rouces
NT
Nina Tahmasebi
LB
Lars Borin
SR
Stian Rødven Eide

Links

URL

DOI