Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

DOI:10.63317/3iuhfriggb5f

Abstract

In this paper, we present the automatic annotation of the bibliographical reference zone in papers and articles in XML/TEI format. Our work proceeds in two phases. First, we use machine learning to classify paragraphs in papers as bibliographical or non-bibliographical, using a model that was initially created to distinguish footnotes that contain bibliographical references from those that do not. This classification is one of the features of BILBO, an open source software for the automatic annotation of bibliographic references. We also suggest some methods to minimize the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvements applied to our model increase its efficiency, yielding an accuracy of 85.89. In testing, we achieve an average success rate of 72.23% in detecting the bibliographical reference zone.
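
The second phase can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so this assumes one plausible reading: each paragraph already carries a bibliographical / non-bibliographical label from the classifier, and the reference zone is taken to be the longest unbroken run of bibliographical paragraphs. The function name `largest_reference_zone` and the boolean label list are hypothetical and are not part of BILBO.

```python
from typing import List, Optional, Tuple


def largest_reference_zone(labels: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (start, end) paragraph indices, inclusive, of the longest
    contiguous run of paragraphs labeled bibliographical (True).

    Assumption: labels[i] is the classifier's decision for paragraph i.
    Returns None when no paragraph is labeled bibliographical.
    On ties, the earliest run is kept.
    """
    best: Optional[Tuple[int, int]] = None
    best_len = 0
    start: Optional[int] = None  # start index of the current run, if any

    for i, is_bib in enumerate(labels):
        if is_bib:
            if start is None:
                start = i
            length = i - start + 1
            if length > best_len:
                best_len = length
                best = (start, i)
        else:
            # A non-bibliographical paragraph ends the current run.
            start = None

    return best


if __name__ == "__main__":
    # Hypothetical labels for a seven-paragraph article.
    labels = [False, True, False, True, True, True, False]
    print(largest_reference_zone(labels))
```

A strict run like this is sensitive to a single misclassified paragraph inside the reference list, which may be one reason the abstract also mentions methods to minimize the margin of error.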

Details

Paper ID
lrec2016-main-576
Pages
pp. 3632-3636
BibKey
htait-etal-2016-bilbo
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-9517408-9-1
Conference
Tenth International Conference on Language Resources and Evaluation
Location
Portorož, Slovenia
Date
23 May 2016 to 28 May 2016

Authors

  • Amal Htait

  • Sebastien Fournier

  • Patrice Bellot

Links