Scientific Article Section Classification (SASC) Dataset
We introduce a novel, publicly available dataset of scientific publications designed for the structural and semantic analysis of their full texts. The collection comprises 4,896 scholarly articles, segmented and parsed into sections using GROBID and custom parsers. To ensure broad utility and diversity, the dataset includes ≈1,000 papers from each of four specialized research areas (Energy, Cancer, Neuroscience, and Transportation), supplemented by an additional ≈1,000 papers randomly selected from general scientific domains. The dataset is annotated using a newly defined two-level hierarchical taxonomy: the first level contains 9 coarse-grained semantic classes, and the second level contains 47 fine-grained semantic classes. All source documents were ethically and legally sourced via OpenAIRE, and the corpus is restricted exclusively to content available under open licenses. License verification was performed by cross-referencing publisher metadata, landing pages, and the Unpaywall database. This curated dataset provides a domain-diverse resource for developing and evaluating NLP models that require training on the hierarchical structure of scientific literature.
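As a rough illustration of what a two-level annotation of this kind might look like in code, the sketch below models a section label as a coarse class paired with a fine class and checks that the pair is consistent with a taxonomy mapping. The class names (`coarse_a`, `fine_a1`, and so on), the `SectionLabel` type, and the `count_classes` helper are all hypothetical placeholders; the actual 9 coarse and 47 fine class names, and the dataset's real file format, are not specified in the description above.

```python
from dataclasses import dataclass

# Hypothetical toy taxonomy. The real taxonomy has 9 coarse-grained and
# 47 fine-grained classes; their names are not given here, so these
# labels are placeholders only.
TAXONOMY = {
    "coarse_a": ["fine_a1", "fine_a2"],
    "coarse_b": ["fine_b1"],
}


@dataclass(frozen=True)
class SectionLabel:
    """A two-level label: a coarse class and one of its fine classes."""

    coarse: str
    fine: str

    def __post_init__(self):
        # Reject labels whose coarse class is unknown.
        if self.coarse not in TAXONOMY:
            raise ValueError(f"unknown coarse class: {self.coarse!r}")
        # Reject fine classes that do not belong to the given coarse class.
        if self.fine not in TAXONOMY[self.coarse]:
            raise ValueError(
                f"{self.fine!r} is not a fine class of {self.coarse!r}"
            )


def count_classes(taxonomy):
    """Return (number of coarse classes, number of fine classes)."""
    return len(taxonomy), sum(len(fine) for fine in taxonomy.values())
```

Under this assumed layout, a model could be trained on the `coarse` field alone for first-level classification, or on the `fine` field with the parent class recovered from the mapping.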