
Phonological Treebanks. Issues in Generation and Application

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/57jzgnu5b7b8

Abstract

The continuing popularity of XML as a data exchange format, together with the concurrent rise of treebanks as natural language resources in various domains of language processing, has led us to extend their domain of application to phonological data. Treebanks are typically a language resource that provides annotations of natural languages at various levels of structure. In this paper we present a tree-based format that captures phonological information at the syllable level, at the segment level, and at the level of more fine-grained featural information. We describe two integrated modules for phonological treebanks. The first uses a multilingual feature set to augment segment-annotated corpora with a tree-based structure represented in XML. The second allows these feature trees to be traversed and the data contained in them to be optimised in a purely data-driven manner.
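As a rough illustration of the kind of structure the abstract describes, the following minimal sketch builds a syllable > segment > feature tree in XML and traverses it. Everything in it is an assumption made for illustration: the element names (`syllable`, `segment`, `feature`), the `FEATURES` table and its values, and the helper functions are hypothetical and are not taken from the paper. The traversal only collects feature values shared by all segments; it stands in for, and does not reproduce, the data-driven optimisation the authors mention.

```python
import xml.etree.ElementTree as ET

# Hypothetical feature table; the values are illustrative only and are
# not the multilingual feature set used in the paper.
FEATURES = {
    "p": {"voice": "-", "place": "labial", "manner": "plosive"},
    "a": {"voice": "+", "height": "low"},
    "t": {"voice": "-", "place": "coronal", "manner": "plosive"},
}


def build_syllable(segments):
    """Build a <syllable> element with one <segment> child per symbol,
    each carrying one <feature> child per entry in FEATURES."""
    syllable = ET.Element("syllable")
    for symbol in segments:
        segment = ET.SubElement(syllable, "segment", symbol=symbol)
        for name, value in FEATURES[symbol].items():
            ET.SubElement(segment, "feature", name=name, value=value)
    return syllable


def shared_features(syllable):
    """Traverse the tree and return the (name, value) pairs that occur
    in every segment of the syllable."""
    per_segment = [
        {(f.get("name"), f.get("value")) for f in seg.iter("feature")}
        for seg in syllable.iter("segment")
    ]
    return set.intersection(*per_segment) if per_segment else set()


tree = build_syllable(["p", "a", "t"])
xml_text = ET.tostring(tree, encoding="unicode")  # serialised XML string
```

Calling `shared_features(build_syllable(["p", "t"]))` on this toy table returns the voicing and manner values the two plosives have in common, which is the sort of redundancy a traversal over feature trees could expose.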

Details

Paper ID
lrec2004-main-105
Pages
N/A
BibKey
neugebauer-wilson-2004-phonological
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Moritz Neugebauer

  • Stephen Wilson
