Title

Querying dependency treebanks with XML

Authors

Gosse Bouma (Rijksuniversiteit Groningen)

Geert Kloosterman (Rijksuniversiteit Groningen)

Session

WP4: Corpus Annotation

Abstract

The need for manual editing during the construction of a treebank may impose constraints on the representation of dependency trees that are not optimal for linguistic exploration. Using XML technology, it is possible to maintain the treebank both in a form suitable for editing and in a form suitable for linguistic exploration. By choosing a compact representation, we can use XPath directly as the query language. We argue that, given an explicit encoding of string positions, this direct encoding of dependency trees as XML trees can represent discontinuous constituents in a way that supports queries involving both dependency and linear order.
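
As an illustration of the idea, the following minimal sketch encodes one dependency tree directly as nested XML elements and queries it. The element name `node`, the attributes `rel`, `cat`, `pos`, `word`, `begin` and `end`, and the example sentence are assumptions made for this sketch, not details taken from the abstract. Python's standard `xml.etree.ElementTree` implements only a subset of XPath, so the comparison of string positions is done in Python here; a full XPath 1.0 processor could presumably state it in the query itself, for example `//node[@rel='vc'][node[@rel='obj1']/@begin < node[@rel='hd']/@begin]`.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of "wat heeft Jan gelezen" ("what has Jan read").
# begin/end are string positions; the "vc" node spans 0-4 but only
# contains the words at 0-1 and 3-4, so it is discontinuous.
TREE = """
<node cat="sv1" begin="0" end="4">
  <node rel="hd" pos="verb" word="heeft" begin="1" end="2"/>
  <node rel="su" pos="noun" word="Jan" begin="2" end="3"/>
  <node rel="vc" cat="ppart" begin="0" end="4">
    <node rel="obj1" pos="pron" word="wat" begin="0" end="1"/>
    <node rel="hd" pos="verb" word="gelezen" begin="3" end="4"/>
  </node>
</node>
"""

root = ET.fromstring(TREE)


def leaves(node):
    """Return all word-bearing nodes at or below `node`."""
    return [n for n in node.iter("node") if n.get("word") is not None]


def sentence(node):
    """Recover the surface order of the words from the begin positions."""
    ordered = sorted(leaves(node), key=lambda n: int(n.get("begin")))
    return [n.get("word") for n in ordered]


def is_discontinuous(node):
    """True if the words below `node` do not fill its whole span."""
    covered = sum(int(n.get("end")) - int(n.get("begin")) for n in leaves(node))
    return covered < int(node.get("end")) - int(node.get("begin"))


# Pure dependency query: every direct object in the tree.
objects = [n.get("word") for n in root.findall(".//node[@rel='obj1']")]

# Constituents whose words are interrupted by material from outside.
discontinuous = [n.get("rel") for n in root.iter("node") if is_discontinuous(n)]

# Dependency plus linear order: verbal complements whose object
# precedes their head.
fronted = []
for vc in root.findall(".//node[@rel='vc']"):
    obj = vc.find("node[@rel='obj1']")
    head = vc.find("node[@rel='hd']")
    if obj is not None and head is not None:
        if int(obj.get("begin")) < int(head.get("begin")):
            fronted.append((obj.get("word"), head.get("word")))
```

In this sketch the XML nesting carries the dependency structure while the position attributes carry the word order, which is why a constituent can be kept as a single subtree even when its words are not adjacent in the sentence.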

Keywords

XML, XPath, Dependency trees, Treebank

Full Paper

54.pdf