Normalizing Section Names and Structure of Scientific Articles

Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026

DOI:10.63317/5ftk2fxxf7jd

Abstract

The growing volume of scientific literature has increased the need for automatic methods that can retrieve, process, and exploit scholarly content. In this work, we explore section name normalization and hierarchy prediction for scientific articles using a two-level taxonomy. We compare independent classification models, sequential classification models, and generative large language models on the SASC dataset. Results show that classification approaches, particularly sequential models that employ document-level context, consistently outperform generative methods. Incorporating section content is essential for fine-grained classification, while generative models remain limited in zero-shot settings. Our experiments highlight the importance of structure-aware modelling for large-scale scholarly document processing, as well as the value of section normalization for developing advanced research mapping and research assessment tools.

Details

Paper ID: lrec2026-ws-nslp-21
Pages: pp. 218-224
BibKey: duransilva-etal-2026-normalizing
Editors: Georg Rehm, Stefan Dietze, Danilo Dessi, Diana Maynard, Sonja Schimmler
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of Natural Scientific Language Processing (NSLP) @ LREC 2026
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Nicolau Duran-Silva
  • Julian Moreno-Schneider
  • César A. Parra-Rojas
  • Georg Rehm
