
Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/59f6s77tynig

Abstract

Abstractive summarization aims to generate concise summaries by creating new sentences, allowing for flexible rephrasing. However, this approach is vulnerable to inaccuracies, particularly ‘hallucinations’, where the model introduces non-existent information. In this paper, we use multimodal and multilingual sentence embeddings derived from pre-trained models such as LaBSE, SONAR, and BGE-M3, and feed them into a modified BART-based French model. To improve the factual consistency of the generated summary, we introduce a Named Entity Injection mechanism that appends tokenized named entities to the decoder input. Our novel framework, SBARThez, is applicable to both text and speech inputs and supports cross-lingual summarization; it shows competitive performance relative to token-level baselines, especially for low-resource languages, while generating more concise and abstract summaries.
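
As a rough illustration of the Named Entity Injection idea described in the abstract, the sketch below appends already-tokenized named entities to a decoder input sequence. It is only a minimal sketch under stated assumptions: the function name `inject_named_entities`, the use of a separator token between entities, and the toy token ids are all hypothetical and are not taken from the paper, which may lay out the decoder input differently.

```python
from typing import List, Sequence


def inject_named_entities(
    decoder_input_ids: Sequence[int],
    entity_token_ids: Sequence[Sequence[int]],
    sep_id: int,
) -> List[int]:
    """Append tokenized named entities to a decoder input sequence.

    Assumption (not from the paper): each entity is preceded by a
    separator token `sep_id` so that entity boundaries stay visible.
    """
    injected = list(decoder_input_ids)
    for entity in entity_token_ids:
        if not entity:
            # Skip entities that tokenized to nothing.
            continue
        injected.append(sep_id)
        injected.extend(entity)
    return injected


# Toy ids, purely illustrative: 0 = start token, 2 = hypothetical separator.
decoder_prefix = [0, 5, 6]
entities = [[10, 11], [12]]
result = inject_named_entities(decoder_prefix, entities, sep_id=2)
print(result)
```

In a real system the entity ids would come from the same tokenizer as the decoder, and the original prefix would be left unchanged so that only the appended span carries the injected entities.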

Details

Paper ID
lrec2026-main-774
Pages
pp. 9873-9883
BibKey
hammoud-etal-2026-multimodal
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Chaimae Chellaf El Hammoud

  • Salima Mdhaffar

  • Yannick Estève

  • Stéphane Huet

Links