
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/2b8fiqsdwe99

Abstract

The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively-studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken ‘in the wild’. To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found descriptions to refer to a mixture of attributes, not only physical, but also emotional and inferential, which is bound to create further challenges for current image-to-text methods.

Details

Paper ID
lrec2018-main-525
Pages
N/A
BibKey
gatt-etal-2018-face2text
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
979-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7–12 May 2018

Authors

  • Albert Gatt
  • Marc Tanti
  • Adrian Muscat
  • Patrizia Paggio
  • Reuben A. Farrugia
  • Claudia Borg
  • Kenneth P. Camilleri
  • Michael Rosner
  • Lonneke van der Plas
