Cross-Lingual Abstractive Keyphrase Generation for Historical Newspapers

Proceedings of Shaping Multilingual, Multimodal AI for the Social Sciences and Humanities (LLMs4SSH) @ LREC 2026

DOI:10.63317/2qijz2a9nwpd

Abstract

We investigate large language models (LLMs) for cross-lingual abstractive keyphrase generation from historical newspapers. The task consists of producing a small set of English keyphrases for articles written in German, French, and Luxembourgish, combining translation, abstraction, and normalization. We conduct a human-centered pilot study comparing model outputs using human selections, LLM-as-judge assessments, and inter-annotator agreement analysis, followed by a medium-scale application to multilingual data from the Impresso corpus. Results show that LLM-generated keyphrases can support semantic enrichment and exploratory analysis of historical collections, while highlighting the subjective and methodologically challenging nature of keyphrase evaluation.

Details

Paper ID
lrec2026-ws-llms4ssh-23
Pages
pp. 218-223
BibKey
clematide-etal-2026-cross
Editors
Arturo Montejo-Raez, Cristina Grisot, Joanna Blochowiak, Nikola Ljubešić, Elena Battaner, German Rigau
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of Shaping Multilingual, Multimodal AI for the Social Sciences and Humanities (LLMs4SSH) @ LREC 2026
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Simon Clematide
  • Jenifer L. Meyer
  • Juri Opitz
  • Maud Ehrmann
  • Kaspar Beelen
