Inter-sentential Relations in Information Extraction Corpora

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2ghsg4fsyp85

Abstract

In natural language, relationships between entities can be asserted within a single sentence or across several sentences in a document. Many information extraction systems are constrained to extracting binary relations that are asserted within a single sentence (single-sentence relations). This limits the proportion of relations they can extract, since relations expressed across multiple sentences (inter-sentential relations) are not considered. The analysis in this paper focuses on finding the distribution of inter-sentential and single-sentence relations in two corpora used for the evaluation of Information Extraction systems: the MUC6 corpus and the ACE corpus from 2003. To carry out this analysis, we had to manually mark up all the management succession relations described in the MUC6 corpus. We found that inter-sentential relations constitute 28.5% and 9.4% of the total number of relations in MUC6 and ACE03 respectively. This places upper bounds of 71.5% and 90.6% respectively on the recall of information extraction systems that do not consider relations asserted across multiple sentences.
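
The recall bounds quoted above follow directly from the reported proportions: a system that never extracts inter-sentential relations can recover at most the remaining share of relations. The sketch below reproduces that arithmetic. The function and variable names are illustrative and do not come from the paper, and the calculation assumes the best case in which every single-sentence relation is extracted correctly.

```python
def recall_upper_bound(inter_sentential_pct: float) -> float:
    """Best-case recall (in percent) for a system limited to single-sentence relations.

    Assumes the system finds every single-sentence relation and none of the
    inter-sentential ones, so the bound is the single-sentence share.
    """
    if not 0.0 <= inter_sentential_pct <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return 100.0 - inter_sentential_pct


# Proportions of inter-sentential relations reported in the abstract.
INTER_SENTENTIAL_PCT = {"MUC6": 28.5, "ACE03": 9.4}

# Round to one decimal place to match the precision used in the abstract.
bounds = {
    corpus: round(recall_upper_bound(pct), 1)
    for corpus, pct in INTER_SENTENTIAL_PCT.items()
}

for corpus, bound in bounds.items():
    print(f"{corpus}: recall <= {bound}%")
```

A real system would normally fall below these figures, since the bound ignores any errors made on single-sentence relations.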

Details

Paper ID
lrec2010-main-621
Pages
N/A
BibKey
swampillai-stevenson-2010-inter
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 to 23 May 2010

Authors

  • Kumutha Swampillai

  • Mark Stevenson