LREC 2016, Main Conference

PotTS: The Potsdam Twitter Sentiment Corpus

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/44pomzyskxnt

Abstract

In this paper, we introduce a novel comprehensive dataset of 7,992 German tweets, which were manually annotated by two human experts with fine-grained opinion relations. The rich annotation scheme used for this corpus covers sentiment-relevant elements such as opinion spans, their respective sources and targets, and emotionally laden terms together with their possible contextual negations and modifiers. Several inter-annotator agreement studies, carried out at different stages of work on these data (during the initial training phase, after an adjudication step, and after the final annotation run), reveal that labeling evaluative judgements in microblogs is an inherently difficult task even for professional coders. These difficulties, however, can be alleviated by letting the annotators revise each other's decisions. Once their annotations have been rechecked, the experts can proceed with the annotation of further messages while maintaining a fairly high level of agreement.

Details

Paper ID
lrec2016-main-181
Pages
pp. 1133-1141
BibKey
sidarenka-2016-potts
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Uladzimir Sidarenka
