Using the NITE XML Toolkit on the Switchboard Corpus to Study Syntactic Choice: a Case Study
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)
Abstract
The NITE XML Toolkit (NXT) provides library support for working with multimodal language corpora. We describe our experience of using it to study discourse effects on syntactic choice, taking the parsed Switchboard Corpus as a starting point. The account is intended as a case study for others who may wish to adopt similar techniques using NXT or one of the other libraries that are beginning to emerge. We discuss conversion into the NXT data format; automatic annotation of markables and of constituent length; hand-annotation of markables for animacy, information structure, and coreferential links; and data analysis.