Extracting Structured Scholarly Information from the Machine Translation Literature

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Abstract

Understanding the experimental results of a scientific paper is crucial to understanding its contribution and to comparing it with related work. We introduce a structured, queryable representation for experimental results and a baseline system that automatically populates this representation. The representation can answer compositional questions such as "Which are the best published results reported on the NIST 09 Chinese to English dataset?" and "What are the most important methods for speeding up phrase-based decoding?" Answering such questions usually involves lengthy literature surveys. Current machine reading for academic papers does not usually consider the actual experiments, but mostly focuses on understanding abstracts. We describe annotation work to create an initial ⟨scientific paper; experimental results representation⟩ corpus. The corpus is composed of 67 papers which were manually annotated with a structured representation of experimental results by domain experts. Additionally, we present a baseline algorithm that characterizes the difficulty of the inference task.

Details

Paper ID

lrec2016-main-067

Pages

pp. 421-425

DOI

10.63317/44y7vm9kzhxx

BibKey

choi-etal-2016-extracting

Editors

Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis

Publisher

European Language Resources Association (ELRA)

ISSN

2522-2686

ISBN

978-2-9517408-9-1

Conference

Tenth International Conference on Language Resources and Evaluation

Location

Portorož, Slovenia

Date

23 - 28 May 2016

Authors

Eunsol Choi
Matic Horvat
Jonathan May
Kevin Knight
Daniel Marcu
