MMQA: A Multi-domain Multi-lingual Question-Answering Framework for English and Hindi

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

In this paper, we assess the challenges for multi-domain, multi-lingual question answering, create necessary resources for benchmarking and develop a baseline model. We curate 500 articles in six different domains from the web. These articles form a comparable corpora of 250 English documents and 250 Hindi documents. From these comparable corpora, we have created 5; 495 question-answer pairs with the questions and answers, both being in English and Hindi. The question can be both factoid or short descriptive types. The answers are categorized in 6 coarse and 63 finer types. To the best of our knowledge, this is the very first attempt towards creating multi-domain, multi-lingual question answering evaluation involving English and Hindi. We develop a deep learning based model for classifying an input question into the coarse and finer categories depending upon the expected answer. Answers are extracted through similarity computation and subsequent ranking. For factoid question, we obtain an MRR value of 49:10% and for short descriptive question, we obtain a BLEU score of 41:37%. Evaluation of question classification model shows the accuracies of 90:12% and 80:30% for coarse and finer classes, respectively.

Resources

Details

Paper ID

lrec2018-main-440

Pages

N/A

DOI

10.63317/34qh4vbtx4di

BibKey

gupta-etal-2018-mmqa

Editors

Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

79-10-95546-00-9

Conference

Eleventh International Conference on Language Resources and Evaluation

Location

Miyazaki, Japan

Date

7 - 12 May 2018

Authors

DG
Deepak Gupta
SK
Surabhi Kumari
AE
Asif Ekbal
PB
Pushpak Bhattacharyya

Links

URL

DOI