ParsCit: an Open-source CRF Reference String Parsing Package

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008)

DOI:10.63317/3pp9nrrhs7x9

Abstract

We describe ParsCit, a freely available, open-source implementation of a reference string parsing package. At the core of ParsCit is a trained conditional random field (CRF) model used to label the token sequences in the reference string. A heuristic model wraps this core with added functionality to identify reference strings in a plain text file and to retrieve the citation contexts. The package comes with utilities to run it as a web service or as a standalone utility. We evaluate ParsCit on three distinct reference string datasets and show that it compares well with previously published work.
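
To illustrate what labeling the token sequence of a reference string involves, the following is a minimal sketch of per-token feature extraction of the kind a CRF sequence labeler could take as input. It is not ParsCit's code: the whitespace tokenizer, the feature names, and the helper functions `tokenize`, `token_features`, and `featurize` are all assumptions made for illustration, and the feature set in the actual package may differ.

```python
import re


def tokenize(reference):
    """Split a reference string on whitespace (an assumed, simplified tokenizer)."""
    return reference.split()


def token_features(tokens, i):
    """Build a feature dict for token i; the feature names are hypothetical."""
    tok = tokens[i]
    # Strip surrounding punctuation so lexical tests see the bare word.
    core = tok.strip(".,;:()[]\"'")
    return {
        "lower": core.lower(),
        "is_capitalized": core[:1].isupper(),
        "is_all_digits": core.isdigit(),
        "looks_like_year": bool(re.fullmatch(r"(19|20)\d{2}", core)),
        "ends_with_period": tok.endswith("."),
        "ends_with_comma": tok.endswith(","),
        # Relative position of the token within the reference, in [0, 1].
        "position": i / max(len(tokens) - 1, 1),
        # Neighbouring tokens give the labeler local context.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def featurize(reference):
    """Return the tokens and one feature dict per token."""
    tokens = tokenize(reference)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]
```

A trained CRF would then assign one label per token (for example author, title, or date) by scoring the whole sequence jointly, which is why features of neighbouring tokens are included alongside those of the token itself.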

Details

Paper ID
lrec2008-main-291
Pages
N/A
BibKey
councill-etal-2008-parscit
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-4-0
Conference
Sixth International Conference on Language Resources and Evaluation
Location
Marrakech, Morocco
Date
28 May 2008 to 30 May 2008

Authors

  • Isaac Councill

  • C. Lee Giles

  • Min-Yen Kan

Links