
Croatian Dependency Treebank: Recent Development and Initial Experiments

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012)

DOI:10.63317/2giqxg25nbf7

Abstract

We present the current state of development of the Croatian Dependency Treebank, with special emphasis on adapting the Prague Dependency Treebank formalism to Croatian language specifics, and illustrate its possible applications in an experiment with dependency parsing using MaltParser. The treebank currently contains approximately 2870 sentences, of which 2699 sentences and 66930 tokens were used in this experiment. Three linear-time projective algorithms implemented by the MaltParser system (Nivre eager, Nivre standard and stack projective), running on default settings, were used in the experiment. The highest performing system, implementing the Nivre eager algorithm, scored LAS 71.31, UAS 80.93 and LA 83.87 within our experiment setup. The results obtained serve as an illustration of the treebank's usefulness in natural language processing research and as a baseline for further research in dependency parsing of Croatian.
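The three scores reported above are standard dependency parsing metrics: LAS (labeled attachment score) counts tokens whose head and dependency label are both correct, UAS (unlabeled attachment score) counts tokens whose head is correct, and LA (label accuracy) counts tokens whose label is correct. A minimal sketch of how such scores could be computed is given below. It is not the evaluation script used in the paper; the function name `attachment_scores`, the `(head, label)` token representation and the four-token toy sentence are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# One token is represented as (head index, dependency label).
# This representation is an assumption for the sketch, not the paper's format.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Return LAS, UAS and LA as percentages over aligned token lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted token lists must have equal length")
    if not gold:
        raise ValueError("cannot score an empty token list")

    total = len(gold)
    # UAS: the predicted head matches the gold head.
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LA: the predicted label matches the gold label.
    label_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))
    # LAS: both head and label match.
    both_ok = sum(g == p for g, p in zip(gold, pred))

    return {
        "LAS": 100.0 * both_ok / total,
        "UAS": 100.0 * head_ok / total,
        "LA": 100.0 * label_ok / total,
    }


# Toy data (hypothetical, not taken from the treebank).
toy_gold = [(2, "Sb"), (0, "Pred"), (4, "Atr"), (2, "Obj")]
toy_pred = [(2, "Sb"), (0, "Pred"), (2, "Atr"), (2, "Adv")]
toy_scores = attachment_scores(toy_gold, toy_pred)
```

In the toy sentence two of four tokens have both head and label correct, three have the correct head and three have the correct label. A token can have the right label with the wrong head, or the reverse, which is why LAS can never exceed UAS or LA; the figures reported in the abstract follow the same ordering.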

Details

Paper ID
lrec2012-main-418
Pages
pp. 1902-1906
BibKey
berovic-etal-2012-croatian
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-7-7
Conference
Eighth International Conference on Language Resources and Evaluation
Location
Istanbul, Turkey
Date
21-27 May 2012

Authors

  • Daša Berović
  • Željko Agić
  • Marko Tadić
