NoReC: The Norwegian Review Corpus

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/2mfhrz244a29

Abstract

This paper presents the Norwegian Review Corpus (NoReC), created for training and evaluating models for document-level sentiment analysis. The full-text reviews have been collected from major Norwegian news sources and cover a range of different domains, including literature, movies, video games, restaurants, music and theater, in addition to product reviews across a range of categories. Each review is labeled with a manually assigned score of 1 to 6, as provided by the rating of the original author. This first release of the corpus comprises more than 35,000 reviews. It is distributed using the CoNLL-U format, pre-processed using UDPipe, along with a rich set of metadata. The work reported in this paper forms part of the SANT initiative (Sentiment Analysis for Norwegian Text), a project seeking to provide open resources and tools for sentiment analysis and opinion mining for Norwegian.

Details

Paper ID
lrec2018-main-661
Pages
N/A
BibKey
velldal-etal-2018-norec
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Erik Velldal

  • Lilja Øvrelid

  • Eivind Alexander Bergem

  • Cathrine Stadsnes

  • Samia Touileb

  • Fredrik Jørgensen

Links