MACAQ : A Multi Annotated Corpus to Study how we Adapt Answers to Various Questions

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2dh4cr8nmt5w

Abstract

This paper presents a corpus of human answers in natural language, collected to build a base of examples useful for generating natural language answers. We present the corpus and the way we acquired it. Answers correspond to questions with fixed linguistic form, focus, and topic. Answers to a given question exist for two modalities of interaction: oral and written. The whole corpus of answers was annotated manually and automatically on different levels, including words from the question that are reused in the answer, the precise element answering the question (or information-answer), and completions. A detailed description of the annotations is presented. Two examples of corpus analyses are described. The first analysis shows some differences between the oral and written modalities, especially in terms of answer length. The second analysis concerns the reuse of the question focus in the answers.

Details

Paper ID
lrec2010-main-208
Pages
N/A
BibKey
garcia-fernandez-etal-2010-macaq
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 to 23 May 2010

Authors

  • Anne Garcia-Fernandez

  • Sophie Rosset

  • Anne Vilnat
