Back to Main Conference 2014
LREC 2014main

The Norwegian Dependency Treebank

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)

DOI:10.63317/2o39r7gos22f

Abstract

The Norwegian Dependency Treebank is a new syntactic treebank for Norwegian Bokmäl and Nynorsk with manual syntactic and morphological annotation, developed at the National Library of Norway in collaboration with the University of Oslo. It is the first publically available treebank for Norwegian. This paper presents the core principles behind the syntactic annotation and how these principles were employed in certain specific cases. We then present the selection of texts and distribution between genres, as well as the annotation process and an evaluation of the inter-annotator agreement. Finally, we present the first results of data-driven dependency parsing of Norwegian, contrasting four state-of-the-art dependency parsers trained on the treebank. The consistency and the parsability of this treebank is shown to be comparable to other large treebank initiatives.

Details

Paper ID
lrec2014-main-273
Pages
pp. 789-795
BibKey
solberg-etal-2014-norwegian
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-8-4
Conference
Ninth International Conference on Language Resources and Evaluation
Location
Reykjavik, Iceland
Date
26 May 2014 31 May 2014

Authors

  • PS

    Per Erik Solberg

  • AS

    Arne Skjærholt

  • Lilja Øvrelid

  • KH

    Kristin Hagen

  • JJ

    Janne Bondi Johannessen

Links