
A Corpus of Literal and Idiomatic Uses of German Infinitive-Verb Compounds

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4ey2i53hgtgx

Abstract

We present an annotation study on a representative dataset of literal and idiomatic uses of German infinitive-verb compounds in newspaper and journal texts. Infinitive-verb compounds pose a challenge for writers of German because the spelling rules differ between literal and idiomatic uses. With the participation of expert lexicographers, we obtained a high-quality corpus resource that can serve as a testbed for automatic idiomaticity detection and coarse-grained word-sense disambiguation. A classifier trained on the corpus distinguishes literal from idiomatic uses with an accuracy of 85%.

Details

Paper ID
lrec2016-main-135
Pages
pp. 836-841
BibKey
horbach-etal-2016-corpus
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23-28 May 2016

Authors

  • Andrea Horbach
  • Andrea Hensler
  • Sabine Krome
  • Jakob Prange
  • Werner Scholze-Stubenrecht
  • Diana Steffen
  • Stefan Thater
  • Christian Wellner
  • Manfred Pinkal
