Back to Main Conference 2010
LREC 2010main

Multilingual Corpus Development for Opinion Mining

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/56zzoispp9io

Abstract

Opinion Mining is a discipline that has attracted some attention lately. Most of the research in this field has been done for English or Asian languages, due to the lack of resources in other languages. In this paper we describe an approach of building a manually annotated multilingual corpus for the domain of product reviews, which can be used as a basis for fine-grained opinion analysis also considering direct and indirect opinion targets. For each sentence in a review, the mentioned product features with their respective opinion polarity and strength on a scale from 0 to 3 are labelled manually by two annotators. The languages represented in the corpus are English, German and Spanish and the corpus consists of about 500 product reviews per language. After a short introduction and a description of related work, we illustrate the annotation process, including a description of the annotation methodology and the developed tool for the annotation process. Then first results on the inter-annotator agreement for opinions and product features are presented. We conclude the paper with an outlook on future work.

Details

Paper ID
lrec2010-main-476
Pages
N/A
BibKey
schulz-etal-2010-multilingual
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • JS

    Julia Maria Schulz

  • CW

    Christa Womser-Hacker

  • TM

    Thomas Mandl

Links