Back to Main Conference 2012
LREC 2012main

Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/45zzq4q9qhsa

Abstract

The present paper tackles the issue of PoS tag conversion within the framework of a distributed web service platform for the automatic creation of language resources. PoS tagging is now considered a """"solved problem""""; yet, because of the differences in the tagsets, interchange of the various PoS tagger available is still hampered. In this paper we describe the implementation of a pos-tagged-corpus converter, which is needed for chaining together in a workflow the Freeling PoS tagger for Italian and the DESR dependency parser, given that these two tools have been developed independently. The conversion problems experienced during the implementation, related to the properties of the different tagsets and of tagset conversion in general, are discussed together with the heuristics implemented in the attempt to solve them. Finally, the converter is evaluated by assessing the impact of conversion on the performance of the dependency parser. From this we learn that in most cases parsing errors are due to actual tagging errors, and not to conversion itself. Besides, information on accuracy loss is an important feature in a distributed environment of (NLP) services, where users need to decide which services best suit their needs.

Details

Paper ID
lrec2012-main-422
Pages
pp. 2125-2131
BibKey
rubino-etal-2012-integrating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21 May 2012 27 May 2012

Authors

  • FR

    Francesco Rubino

  • FF

    Francesca Frontini

  • VQ

    Valeria Quochi

Links