Information structure in the Potsdam Commentary Corpus: Topics

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/2qd3qvhe9cjw

Abstract

The Potsdam Commentary Corpus is a collection of 175 German newspaper commentaries annotated on a variety of layers. This paper introduces a new layer that covers the linguistic notion of information-structural topic (not to be confused with 'topic' as applied to documents in information retrieval). To our knowledge, this is the first larger topic-annotated resource for German (and one of the first for any language). We describe the annotation guidelines, the annotation process, and the results of an inter-annotator agreement study, which compare favourably with related work. The annotated corpus is freely available for research.

Details

Paper ID
lrec2016-main-271
Pages
pp. 1718-1723
BibKey
stede-mamprin-2016-information
Editors
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 - 28 May 2016

Authors

  • Manfred Stede

  • Sara Mamprin
