An Annotation Scheme for a Rhetorical Analysis of Biology Articles

Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004)

DOI:10.63317/3622e2xb8ams

Abstract

In information extraction from scientific texts, it is crucially important to identify the unique contribution of the research. The task is complicated by the large number of statements in each article that pertain to results, including references to previous work and technical details. Simple keyword searches are helpful for a content-based analysis but fail to distinguish new results from other results. We approach the problem from a rhetorical perspective and give a 'zone analysis' (ZA) of texts in light of Teufel, Carletta & Moens (1999). We analyze a text into 'zones' with shallow nesting, based on the rhetorical status into which each sequence of statements fits, and annotate the text correspondingly. Our current focus is on the molecular biology domain. In this paper, we propose an annotation scheme for ZA based on an empirical analysis of major online journals (EMBO, NAR, PNAS, and JCB), and illustrate how it works. Our scheme provides a way to differentiate the text in terms of the aspects of the author's own work (e.g. experimental procedure, findings, implications) and to identify a set of statements relating data and findings; it therefore helps identify the author's new results and findings.
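
As an illustration only, the idea of zones with shallow nesting over a sequence of statements could be represented roughly as in the sketch below. The zone labels, the `Zone` class, and the `zones_with_label` helper are hypothetical names chosen for this example; the labels are loosely based on the aspects mentioned in the abstract and are not the tag set defined in the paper. Treating "shallow" as a single level of nesting is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels, loosely based on aspects named in the abstract.
# The paper's actual annotation scheme may use different categories.
LABELS = {"previous_work", "procedure", "finding", "implication"}


@dataclass
class Zone:
    """A labeled span of sentences, identified by inclusive sentence indices."""
    label: str
    start: int
    end: int
    children: List["Zone"] = field(default_factory=list)
    is_nested: bool = False  # True once this zone sits inside another zone

    def __post_init__(self) -> None:
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")
        if self.start > self.end:
            raise ValueError("start must not exceed end")

    def add_child(self, child: "Zone") -> None:
        """Nest `child` in this zone, allowing one level only (an assumption)."""
        if self.is_nested or child.children:
            raise ValueError("only one level of nesting is allowed in this sketch")
        if not (self.start <= child.start and child.end <= self.end):
            raise ValueError("child span must lie inside the parent span")
        child.is_nested = True
        self.children.append(child)


def zones_with_label(zones: List[Zone], label: str) -> List[Zone]:
    """Collect top-level and nested zones that carry the given label."""
    found = []
    for zone in zones:
        if zone.label == label:
            found.append(zone)
        found.extend(c for c in zone.children if c.label == label)
    return found


# Toy annotation: sentences 0-1 describe a procedure, sentences 2-4 report a
# finding, and sentence 4 additionally states an implication of that finding.
article = [Zone("procedure", 0, 1), Zone("finding", 2, 4)]
article[1].add_child(Zone("implication", 4, 4))
print([(z.start, z.end) for z in zones_with_label(article, "implication")])
```

With a representation along these lines, statements tagged as the author's own findings could be retrieved separately from statements about previous work, which is the kind of distinction the abstract says keyword search alone cannot make.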

Details

Paper ID
lrec2004-main-294
Pages
N/A
BibKey
mizuta-collier-2004-annotation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-1-6
Conference
Fourth International Conference on Language Resources and Evaluation
Location
Lisbon, Portugal
Date
26 May 2004 to 28 May 2004

Authors

  • Yoko Mizuta

  • Nigel Collier
