Back to Main Conference 2010
LREC 2010main

Disambiguating Compound Nouns for a Dynamic HPSG Treebank of Wall Street Journal Texts

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2tur3omvhvun

Abstract

The aim of this paper is twofold. We focus, on the one hand, on the task of dynamically annotating English compound nouns, and on the other hand we propose disambiguation methods and techniques which facilitate the annotation task. Both the aforementioned are part of a larger on-going effort which aims to create HPSG annotation for the texts from theWall Street Journal (henceforward WSJ) sections of the Penn Treebank (henceforward PTB) with the help of a hand-written large-scale and wide-coverage grammar of English, the English Resource Grammar (henceforward ERG; Flickinger (2002)). As we show in this paper, such annotations are very rich linguistically, since apart from syntax they also incorporate semantics, which does not only ensure that the treebank is guaranteed to be a truly sharable, re-usable and multi-functional linguistic resource, but also calls for the necessity of a better disambiguation of the internal (syntactic) structure of larger units of words, such as compound nouns, since this has an impact on the representation of their meaning, which is of utmost interest if the linguistic annotation of a given corpus is to be further understood as the practice of adding interpretative linguistic information of the highest quality in order to give “added value” to the corpus.

Details

Paper ID
lrec2010-main-345
Pages
N/A
BibKey
kordoni-zhang-2010-disambiguating
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • VK

    Valia Kordoni

  • YZ

    Yi Zhang

Links