
IHPP: A Paragraph-Level Dataset for Investigating the Pragmatics of Hyperpartisan Italian News

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/2sabeivqu9dt

Abstract

This study investigates the linguistic composition of hyperpartisan paragraphs in Italian news on climate change, the war in Ukraine, and immigration, and publicly releases the dataset to ensure reproducibility. We introduce a new corpus, IHPP, of 356 articles, for a total of 4,861 paragraphs annotated for hyperpartisan news detection at the paragraph level and enriched with span-level annotations of six semantic-pragmatic linguistic traits: figurative speech, irony/sarcasm, epithet, as well as hyperbolic and loaded language. We hypothesized that these traits, while violating Gricean maxims, are key mechanisms of hyperpartisan rhetoric. To test this, we fine-tuned a set of mono- and multilingual BERT models for hyperpartisan detection and evaluated how these traits are incorporated in the embedding space. Then, we applied explainability techniques, e.g., Integrated Gradients and SHAP, to analyze how models allocate attribution to normal and linguistic-trait tokens. Our results show that loaded language is the most discriminative trait. The dataset is available at https://github.com/MichJoM/IHPP-Climate.

Details

Paper ID
lrec2026-main-157
Pages
pp. 1998-2011
BibKey
maggini-etal-2026-ihpp
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Michele Joshua Maggini

  • Davide Bassi

  • Angelo Valente

  • Gaël Dias

  • Pablo Gamallo

Links