UniCite: A Dataset and Unified Hierarchical Taxonomy for Multi-Dimensional Citation Analysis

Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026

DOI:10.63317/472sud9pm8t5

Abstract

Research in Citation Context Analysis (CCA) has produced numerous taxonomic schemes that range from three to more than 12 categories, differ in granularity, and provide no mappings between frameworks, which severely limits systematic comparison and progress. Despite decades of study, CCA methods have largely relied on fragmented frameworks that treat citation tasks independently, ignoring systematic relationships between function classification, sentiment analysis, and importance assessment. To address these research gaps, we present three integrated contributions. First, we develop UniCite, a two-level taxonomy (six primary functions, 12 subcategories, two orthogonal dimensions) that systematically integrates three existing schemes. Second, we develop a comprehensive dataset of 4,017 citations that combines established resources with 1,547 newly extracted citations from 2018-2024 publications, all manually annotated under our unified framework. Third, we demonstrate systematic task relationships through multi-task learning, achieving a 21.1% relative improvement in subfunction classification over single-task approaches.
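
The 21.1% figure is reported as a relative improvement, which is different from a gain in absolute points. As a minimal sketch of that distinction, assuming the usual definition (multi-task score minus single-task score, divided by the single-task score), the Python snippet below uses a hypothetical helper `relative_improvement` and hypothetical scores that are not taken from the paper:

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative gain of `improved` over `baseline` as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical scores for illustration only; NOT results from the paper.
single_task_f1 = 0.50
multi_task_f1 = 0.60

gain = relative_improvement(single_task_f1, multi_task_f1)
print(f"relative improvement: {gain:.1%}")
print(f"absolute improvement: {multi_task_f1 - single_task_f1:.2f}")
```

Under this definition, the same absolute gain yields a larger relative improvement when the single-task baseline is lower, so the two numbers should not be compared directly.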

Details

Paper ID
lrec2026-ws-nslp-28
Pages
pp. 277-288
BibKey
mourky-etal-2026-unicite
Editors
Georg Rehm, Stefan Dietze, Danilo Dessi, Diana Maynard, Sonja Schimmler
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Amina Mourky
  • Elena Leitner
  • Julian Moreno-Schneider
  • Raia Abu Ahmad
  • Ekaterina Borisova
  • Georg Rehm
