LREC 2004, main conference

Semi-automatic Syntactic and Semantic Corpus Annotation with a Deep Parser

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/3rbv9caf8p4o

Abstract

We describe a semi-automatic method for linguistically rich corpus annotation using a broad-coverage deep parser to generate syntactic structure, semantic representation and discourse information for task-oriented dialogs. The parser-generated analyses are checked by trained annotators. Incomplete coverage and incorrect analyses are addressed through lexicon and grammar development, after which the dialogs undergo another cycle of parsing and checking. Currently we have 85% correct annotations in our emergency rescue task domain and 70% in our medication scheduling domain. This iterative process of corpus annotation allows us to create domain-specific gold-standard corpora for test suites and corpus-based experiments as part of general system development.
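
The parse, check, revise cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the names `Utterance`, `annotate_iteratively`, `parse`, `check` and `revise_grammar` are hypothetical, the parser is a toy lexicon lookup, and the sketch assumes that only utterances not yet approved are re-parsed in later cycles.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Utterance:
    text: str
    analysis: Optional[str] = None  # parser output; None when no parse was found
    approved: bool = False          # set once an annotator accepts the analysis


def annotate_iteratively(
    utterances: List[Utterance],
    parse: Callable[[str], Optional[str]],
    check: Callable[[Utterance], bool],
    revise_grammar: Callable[[List[Utterance]], None],
    max_cycles: int = 3,
) -> float:
    """Run parse/check cycles and return the fraction of approved annotations."""
    if not utterances:
        return 0.0
    for _ in range(max_cycles):
        failures = []
        for utt in utterances:
            if utt.approved:
                continue  # assumption: accepted analyses are not re-parsed
            utt.analysis = parse(utt.text)
            # An utterance passes only if it parsed and the annotator accepts it.
            utt.approved = utt.analysis is not None and check(utt)
            if not utt.approved:
                failures.append(utt)
        if not failures:
            break
        # Stand-in for lexicon and grammar development between cycles.
        revise_grammar(failures)
    return sum(u.approved for u in utterances) / len(utterances)


# Toy demonstration: the "parser" succeeds only when every word is in the lexicon.
lexicon = {"send", "ambulance"}


def toy_parse(text: str) -> Optional[str]:
    words = text.split()
    return " ".join(w.upper() for w in words) if all(w in lexicon for w in words) else None


def toy_check(utt: Utterance) -> bool:
    return True  # a human annotator would inspect utt.analysis here


def toy_revise(failed: List[Utterance]) -> None:
    for utt in failed:
        lexicon.update(utt.text.split())


corpus = [Utterance("send ambulance"), Utterance("take aspirin")]
print(annotate_iteratively(corpus, toy_parse, toy_check, toy_revise))
```

The returned fraction corresponds loosely to the per-domain percentages of correct annotations reported in the abstract; in the toy run the second utterance fails in the first cycle and succeeds once its words have been added to the lexicon.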

Details

Paper ID
lrec2004-main-460
Pages
N/A
BibKey
swift-etal-2004-semi
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26-28 May 2004

Authors

  • Mary D. Swift

  • Myroslava O. Dzikovska

  • Joel R. Tetreault

  • James F. Allen
