LREC 2018 main

KRAUTS: A German Temporally Annotated News Corpus

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/4ap8wacz3r76

Abstract

In recent years, temporal tagging, i.e., the extraction and normalization of temporal expressions, has become a vibrant research area. Several tools have been made available, and new strategies have been developed. Due to domain-specific challenges, evaluations of new methods should be performed on diverse text types. Despite significant efforts towards multilinguality in the context of temporal tagging, for all languages except English, annotated corpora exist only for a single domain. In the case of German, for example, only a narrative-style corpus has been manually annotated so far, so German temporal tagging performance on news articles could not be evaluated. In this paper, we present KRAUTS, a new German temporally annotated corpus containing two subsets of news documents: articles from the daily newspaper Dolomiten and from the weekly newspaper Die Zeit. Overall, the corpus contains 192 documents with 1,140 annotated temporal expressions, and it has been made publicly available to further boost research in temporal tagging.

Details

Paper ID
lrec2018-main-085
Pages
N/A
BibKey
strotgen-etal-2018-krauts
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Jannik Strötgen
  • Anne-Lyse Minard
  • Lukas Lange
  • Manuela Speranza
  • Bernardo Magnini
