LREC 2026 workshop

Retrieving Floods without Floodlights: Topic Models as Binary Classifiers for Extreme Climate Events in German News

Proceedings of the 2nd Workshop on Ecology, Environment, and Natural Language Processing

DOI:10.63317/56axeoxcmhfk

Abstract

In studies of media coverage of extreme climate events, NLP methods have become indispensable for identifying relevant texts in large news databases. However, enough annotated data to train accurate deep learning-based classifiers from scratch is often unavailable. Topic Models have the advantage of being both unsupervised and interpretable, but they are typically used only for exploratory analysis or data characterisation. In this study, we investigate how to employ Topic Models as binary classifiers to refine the retrieval of relevant news about seven types of extreme climate events in German media. Our method relies on the posterior distributions estimated by Topic Models to select relevant documents, without modifying their training procedure. Using an annotated sample to guide the evaluation, we show that the probabilities assigned to the keywords used to query news databases can also be informative for selecting relevant topics and can improve sample precision. We compare our results to a fine-tuned text embedding classifier and an open-weight LLM and discuss the observed trade-offs, e.g. the LLM having the lowest precision. Moreover, we show that results are hazard-dependent, which speaks against treating climate events as a single category in NLP tasks.
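
The idea of using a trained topic model as a binary classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, both cutoffs, and all toy probabilities below are assumptions made for illustration. The sketch assumes a topic model has already been fitted and exposes a topic-word distribution and a document-topic posterior; topics are selected by the probability mass they assign to the query keywords, and a document is labelled relevant when enough of its posterior mass falls on the selected topics.

```python
from typing import Dict, List, Sequence


def select_topics(topic_word: Sequence[Dict[str, float]],
                  keywords: Sequence[str],
                  min_keyword_mass: float) -> List[int]:
    """Return indices of topics whose summed probability over the
    query keywords reaches the (hypothetical) cutoff."""
    selected = []
    for k, word_probs in enumerate(topic_word):
        mass = sum(word_probs.get(w, 0.0) for w in keywords)
        if mass >= min_keyword_mass:
            selected.append(k)
    return selected


def classify_documents(doc_topic: Sequence[Sequence[float]],
                       relevant_topics: Sequence[int],
                       min_topic_mass: float) -> List[bool]:
    """Label a document relevant when its posterior mass on the
    selected topics reaches the (hypothetical) cutoff."""
    return [sum(theta[k] for k in relevant_topics) >= min_topic_mass
            for theta in doc_topic]


def precision(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Share of documents predicted relevant that are truly relevant."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy topic-word distributions (invented numbers, three topics).
TOPIC_WORD = [
    {"hochwasser": 0.4, "überschwemmung": 0.3, "pegel": 0.3},              # flood reporting
    {"fußball": 0.6, "tor": 0.4},                                          # sports
    {"kritik": 0.5, "debatte": 0.35, "flut": 0.1, "hochwasser": 0.05},     # figurative usage
]

# Toy document-topic posteriors (invented numbers, four documents).
DOC_TOPIC = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
    [0.6, 0.0, 0.4],
]

# Hypothetical query keywords and annotated gold labels.
KEYWORDS = ["hochwasser", "überschwemmung", "flut"]
GOLD = [True, False, False, True]

relevant_topics = select_topics(TOPIC_WORD, KEYWORDS, min_keyword_mass=0.3)
predicted = classify_documents(DOC_TOPIC, relevant_topics, min_topic_mass=0.5)
sample_precision = precision(predicted, GOLD)
```

In this toy setting only the first topic passes the keyword cutoff, so the third document, which matches a query keyword only in a figurative sense, is filtered out. In practice both cutoffs would need to be tuned against an annotated sample, and, given that the results are reported to be hazard-dependent, possibly per event type.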

Details

Paper ID
lrec2026-ws-nlp4ecology-11
Pages
pp. 115-134
BibKey
madureira-etal-2026-retrieving
Editors
Francesca Grasso, Valerio Basile, Cristina Bosco, Muhammad Okky Ibrohim, Maria Skeppstedt, Manfred Stede
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the 2nd Workshop on Ecology, Environment, and Natural Language Processing
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Brielen Madureira

  • Mariana Madruga de Brito

  • Andreas Niekler

Links