A Corpus of Literal and Idiomatic Uses of German Infinitive-Verb Compounds

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/4ey2i53hgtgx

Abstract

We present an annotation study on a representative dataset of literal and idiomatic uses of German infinitive-verb compounds in newspaper and journal texts. Infinitive-verb compounds pose a challenge for writers of German because the spelling rules differ between literal and idiomatic uses. With the participation of expert lexicographers, we obtained a high-quality corpus resource that can serve as a testbed for automatic idiomaticity detection and coarse-grained word-sense disambiguation. A classifier trained on the corpus distinguished literal from idiomatic uses with an accuracy of 85%.

Details

Paper ID: lrec2016-main-135
Pages: pp. 836-841
BibKey: horbach-etal-2016-corpus
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
ISSN: 2522-2686
ISBN: 978-2-9517408-9-1
Conference: Tenth International Conference on Language Resources and Evaluation
Location: Portorož, Slovenia
Date: 23-28 May 2016

Authors

  • Andrea Horbach
  • Andrea Hensler
  • Sabine Krome
  • Jakob Prange
  • Werner Scholze-Stubenrecht
  • Diana Steffen
  • Stefan Thater
  • Christian Wellner
  • Manfred Pinkal
