Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DOI:10.63317/2dfkbvjdmnpb

Abstract

In this paper, we present a double annotation system for new handwritten historical documents. We have 25,250 pages of registers of the Italian Comedy of the 18th century, containing a great variety and amount of information. A crowdsourcing platform has been set up to label and transcribe the documents. The main purpose is to gather budget data from the whole 18th century and to create a dedicated database for domain experts. To assist and accelerate the process, a parallel system has been designed to process information automatically. We focus on the title fields, segmenting them into lines and checking candidate transcripts. We have collected a dataset of 971 title lines.

Details

Paper ID
lrec2018-main-069
Pages
N/A
BibKey
granet-etal-2018-crowdsourcing
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
79-10-95546-00-9
Conference
Eleventh International Conference on Language Resources and Evaluation
Location
Miyazaki, Japan
Date
7 May 2018 to 12 May 2018

Authors

  • Adeline Granet
  • Benjamin Hervy
  • Geoffrey Roman-Jimenez
  • Marouane Hachicha
  • Emmanuel Morin
  • Harold Mouchère
  • Solen Quiniou
  • Guillaume Raschia
  • Françoise Rubellin
  • Christian Viard-Gaudin
