Back to Main Conference 2008
LREC 2008main

Deriving Rhetorical Complexity Data from the RST-DT Corpus

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/4jo85ba86kjw

Abstract

This paper describes a study of the levels at which different rhetorical relations occur in rhetorical structure trees. In a previous empirical study (Williams and Reiter, 2003) of the RST-DT (Rhetorical Structure Theory Discourse Treebank) Corpus (Carlson et al., 2003), we noticed that certain rhetorical relations tended to occur more frequently at higher levels in a rhetorical structure tree, whereas others seemed to occur more often at lower levels. The present study takes a closer look at the data, partly to test this observation, and partly to investigate related issues such as the relative complexity of satellite and nucleus for each type of relation. One practical application of this investigation would be to guide discourse planning in Natural Language Generation (NLG), so that it reflects more accurately the structures found in documents written by human authors. We present our preliminary findings and discuss their relevance for discourse planning.

Details

Paper ID
lrec2008-main-332
Pages
N/A
BibKey
williams-power-2008-deriving
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 30 May 2008

Authors

  • SW

    Sandra Williams

  • RP

    Richard Power

Links