A Study of Parentheticals in Discourse Corpora - Implications for NLG Systems

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/2v8sbkqsqka2

Abstract

This paper presents a corpus study of parenthetical constructions in two corpora: the Penn Discourse Treebank (PDTB; PDTBGroup, 2008) and the RST Discourse Treebank (Carlson et al., 2001). The study aims to clarify the rhetorical properties of parentheticals so that a natural language generation system can produce them as part of rhetorically well-formed output. We argue that there is a correlation between syntactic and rhetorical types of parentheticals and establish two main categories: ELABORATION/EXPANSION-type NP-modifier parentheticals and NON-ELABORATION/EXPANSION-type VP- or S-modifier parentheticals. We show several strategies for extracting these from the two corpora and discuss how the seemingly contradictory results can be reconciled in light of the rhetorical and syntactic properties of parentheticals and the decisions taken in the annotation guidelines.

Details

Paper ID
lrec2008-main-459
Pages
N/A
BibKey
banik-lee-2008-study
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28-30 May 2008

Authors

  • Eva Banik

  • Alan Lee
