LREC 2026 Workshop

A Vocabulary Analysis of News Articles in Relation to the Political Orientation of Their Source and Their Thematic

Proceedings of the Second Workshop on Building Educational Applications Using NLP

DOI:10.63317/4ygr7uhixm78

Abstract

Understanding how political orientation influences lexical choices is essential for detecting bias and framing in news media. In this paper, we present a computational framework for identifying nouns whose interpretation varies across politically divergent newspapers. Using a large corpus of French news articles published in 2024, we categorize texts by topic and political orientation. We cluster noun occurrences using contextual embeddings to detect semantic variation and dissimilarity among sources. This allows us to map semantic distances between newspapers and identify polarized or editorially marked lexical choices. Our results show that topics, polysemy, and editorial priorities contribute differently to lexical divergence. We discuss these findings and highlight how contextual embeddings can help reveal semantic biases that would remain invisible to frequency-based methods. We conclude by outlining perspectives for improving topic classification and the clustering method, exploring alternative divergence measures, conducting a qualitative analysis of our results, and extending the framework to other languages or genres.
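
The pipeline described in the abstract (cluster the occurrences of a noun, then compare how each newspaper's occurrences spread over the clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the abstract does not name a clustering algorithm or a divergence measure, so plain k-means and Jensen-Shannon divergence stand in for them; the 2-D vectors are toy stand-ins for contextual embeddings; and the names `paper_A`, `paper_B`, `paper_C`, `kmeans`, `js_divergence`, and `source_distances` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def kmeans(points, k, iters=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each occurrence to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def source_distances(occurrences, k):
    """Cluster all occurrences of one noun, then compare sources pairwise."""
    sources = [s for s, _ in occurrences]
    labels = kmeans([v for _, v in occurrences], k)
    # Count how many of each source's occurrences fall in each cluster.
    by_source = defaultdict(Counter)
    for s, lab in zip(sources, labels):
        by_source[s][lab] += 1
    # Normalise the counts into one cluster distribution per source.
    dists = {}
    for s, counts in by_source.items():
        total = sum(counts.values())
        dists[s] = [counts[c] / total for c in range(k)]
    return {(a, b): js_divergence(dists[a], dists[b])
            for a, b in combinations(sorted(dists), 2)}


# Toy data: (source, vector) pairs for occurrences of a single noun.
# The vectors are hypothetical stand-ins for contextual embeddings.
OCCURRENCES = [
    ("paper_A", (0.9, 0.1)), ("paper_B", (0.1, 0.9)),
    ("paper_A", (1.0, 0.0)), ("paper_A", (0.8, 0.2)),
    ("paper_B", (0.0, 1.0)), ("paper_B", (0.2, 0.8)),
    ("paper_C", (0.9, 0.0)), ("paper_C", (0.1, 1.0)),
]

distances = source_distances(OCCURRENCES, k=2)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
```

In this toy setup, `paper_A` and `paper_B` use the noun in two disjoint clusters and so end up maximally distant, while `paper_C`, which uses both senses, sits between them. A frequency count alone would not separate the three sources, since each simply uses the noun a few times.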

Details

Paper ID: lrec2026-ws-politicalnlp-12
Pages: pp. 111-119
BibKey: cave-etal-2026-vocabulary
Editors: N/A
Publisher: European Language Resources Association (ELRA)
ISSN: N/A
ISBN: N/A
Workshop: Proceedings of the Second Workshop on Building Educational Applications Using NLP
Location: Palma, Mallorca, Spain
Date: 11-16 May 2026

Authors

  • Laurène Cave
  • Gaël Lejeune
