FEIDEGGER: A Multi-modal Corpus of Fashion Images and Descriptions in German

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

The availability of multi-modal datasets that pair images with textual descriptions of their content has been a crucial driver of progress in various text-image tasks such as automatic captioning and text-to-image retrieval. In this paper, we present FEIDEGGER, a new multi-modal corpus that focuses specifically on the domain of fashion items and their visual descriptions in German. We argue that such narrow-domain multi-modality presents a unique set of challenges, such as fine-grained image distinctions and domain-specific language, and we release this dataset to the research community to enable study of these challenges. This paper illustrates our crowdsourcing strategy for acquiring the textual descriptions, gives an overview of the FEIDEGGER dataset, and discusses possible use cases.

Details

Paper ID

lrec2018-main-070

Pages

N/A

DOI

10.63317/48kpkwvk9g69

BibKey

lefakis-etal-2018-feidegger

Editors

Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

979-10-95546-00-9

Conference

Eleventh International Conference on Language Resources and Evaluation

Location

Miyazaki, Japan

Date

7 - 12 May 2018

Authors

Leonidas Lefakis
Alan Akbik
Roland Vollgraf
