Back to Main Conference 2008
LREC 2008main

Unsupervised Resource Creation for Textual Inference Applications

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/3nk8p7hsan2f

Abstract

This paper explores how a battery of unsupervised techniques can be used in order to create large, high-quality corpora for textual inference applications, such as systems for recognizing textual entailment (TE) and textual contradiction (TC). We show that it is possible to automatically generate sets of positive and negative instances of textual entailment and contradiction from textual corpora with greater than 90% precision. We describe how we generated more than 1 million TE pairs - and a corresponding set of and 500,000 TC pairs - from the documents found in the 2 GB AQUAINT-2 newswire corpus.

Details

Paper ID
lrec2008-main-224
Pages
N/A
BibKey
bensley-hickl-2008-unsupervised
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 30 May 2008

Authors

  • JB

    Jeremy Bensley

  • AH

    Andrew Hickl

Links