
LAperLA: an integrated graphical-linguistic System for old printed Latin Texts

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/3fzqgyjj3hn6

Abstract

LAperLA (Lettore Automatico per Libri Antichi) is a prototype for the automatic recognition of Latin texts in old printed books. The strengths of the system are its neural architecture and its linguistic post-processing tool, which consists of an index of more than 500,000 Latin forms and a query management system that uses the index to check and correct the interpreted words. The images were taken from the text of "Contradicentium Medicorum" by Girolamo Cardano in the edition printed in 1663; the main textual material consists of a set of 40 image files (11 for training and 29 for testing) with a resolution of 118 DPI. The interpretation results produced by LAperLA on the images chosen as benchmarks have been compared with those of FineReader 4.0 by ABBYY and OmniPage Pro 10 by Caere. FineReader reaches a correctness rate of 61.19% and OmniPage 54.41%, while LAperLA recognises 80.95% of the words, a figure that rises to 93.22% with the aid of the specific linguistic module. An easy-to-use system interface has been developed both for training the net and for selecting the parts of the image files to be interpreted.
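
The abstract does not say how the query management system uses the index to check and correct words, so the following is only a minimal sketch of one plausible approach: look each recognised word up in an index of known forms and, if it is missing, substitute the closest indexed form. The tiny `LATIN_FORMS` set, the function names `correct_word` and `word_accuracy`, and the similarity cutoff of 0.8 are all assumptions made for illustration and do not come from the paper.

```python
from difflib import get_close_matches

# Hypothetical miniature index; the real index is reported to hold
# more than 500,000 Latin forms.
LATIN_FORMS = {"medicorum", "contradicentium", "corpus", "natura", "morbus"}


def correct_word(word, index=LATIN_FORMS, cutoff=0.8):
    """Check a recognised word against the index.

    Returns the word unchanged if it is indexed, otherwise the closest
    indexed form, otherwise the word unchanged (no confident correction).
    The cutoff value is an assumption, not a figure from the paper.
    """
    key = word.lower()
    if key in index:
        return word
    # sorted() gives a deterministic candidate order for ties.
    matches = get_close_matches(key, sorted(index), n=1, cutoff=cutoff)
    return matches[0] if matches else word


def word_accuracy(recognised, reference):
    """Percentage of positions where the recognised word equals the reference."""
    correct = sum(r == t for r, t in zip(recognised, reference))
    return 100.0 * correct / len(reference)
```

A word-level accuracy function of this kind is one way figures such as 80.95% and 93.22% could be computed, although the paper's exact scoring procedure is not given in the abstract.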

Details

Paper ID
lrec2002-main-025
Pages
N/A
BibKey
bozzi-2002-laperla
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Andrea Bozzi
