Extracting Surface Realisation Templates from Corpora

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/39wmbmhx9ye9

Abstract

In Natural Language Generation (NLG), template-based surface realisation is an effective way to produce surface strings from a given semantic representation. Many applications, however, cannot provide the input knowledge at the required level of detail, which may limit the use of the available NLG resources. If the most likely output sentences are known in advance (e.g., because a corpus on the relevant application domain is available), then corpus knowledge may be used to quickly deploy a surface realisation engine for small-scale applications. For these it may be sufficient to select a natural-language sentence that resembles the desired output and then modify some or all of its constituents accordingly. In other words, the application may simply 'point to' an existing sentence in the corpus and specify only the changes needed to obtain the desired surface string. In this paper we describe one such approach to surface realisation, in which we extract syntactically structured templates from a target corpus and use these templates to produce existing and modified versions of the target sentences by a combination of canned text and basic dependency-tree operations.
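
The 'point to a sentence and specify only the changes' idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `realise` function, the role labels (`subj`, `time`, and so on) and the example sentence are all hypothetical, chosen only to show how a template stored as a small dependency tree might be realised as-is or with some constituents replaced.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One constituent of a template (hypothetical structure).

    `text` is the canned text for the head word; `left` and `right`
    hold dependents that are realised before and after the head.
    """
    label: str
    text: str
    left: List["Node"] = field(default_factory=list)
    right: List["Node"] = field(default_factory=list)

    def words(self) -> List[str]:
        # Linearise the subtree: left dependents, head, right dependents.
        out: List[str] = []
        for child in self.left:
            out.extend(child.words())
        out.append(self.text)
        for child in self.right:
            out.extend(child.words())
        return out


def realise(template: Node, changes: Optional[Dict[str, Node]] = None) -> str:
    """Realise a template, swapping in replacement subtrees by label.

    With no changes, the original corpus sentence is reproduced.
    """
    changes = changes or {}

    def rewrite(node: Node) -> Node:
        if node.label in changes:
            return changes[node.label]  # replace the whole constituent
        return Node(
            node.label,
            node.text,
            [rewrite(c) for c in node.left],
            [rewrite(c) for c in node.right],
        )

    return " ".join(rewrite(template).words())


# Hypothetical template for a corpus sentence (assumed example).
template = Node(
    "root",
    "starts",
    left=[Node("subj", "meeting", left=[Node("det", "the")])],
    right=[Node("prep", "at", right=[Node("time", "noon")])],
)

print(realise(template))
print(realise(template, {"time": Node("time", "midnight")}))
```

Under these assumptions, the application supplies only the label of the constituent to change and its replacement subtree; everything else is reused from the template as canned text.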

Details

Paper ID
lrec2010-main-494
Pages
N/A
BibKey
tadeu-etal-2010-extracting
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 to 23 May 2010

Authors

  • Thiago D. Tadeu

  • Eder M. de Novais

  • Ivandré Paraboni

Links