HuNeBR: A Multitask Benchmark to Evaluate LLMs’ Understanding of Northeastern Brazilian Portuguese Humor

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

DOI:10.63317/22b3hhj77udn

Abstract

Humor recognition is a major challenge in Natural Language Processing (NLP) due to its subtle and context-dependent nature. Despite advances, Large Language Models (LLMs) still struggle with this task, especially in Brazilian Portuguese, where no dedicated benchmarks exist. This paper presents HuNeBR, a new benchmark of 475 annotated humorous texts from Northeastern Brazilian comedians. The benchmark evaluates LLMs on three tasks: identifying punchlines, classifying texts into eight comic styles, and explaining humor. This is the first benchmark to evaluate LLMs on the in-depth interpretation of humorous texts in Brazilian Portuguese, going beyond the binary tasks of traditional humor benchmarks. Both general-purpose and Portuguese-specialized LLMs were evaluated under zero-shot and few-shot settings. The findings indicate that LLMs perform very well at identifying punchlines, show inconsistent results in classifying comic styles, and produce humor interpretations that mostly align with human judgments. Among the models assessed, general-purpose multilingual systems like GPT-4 and Gemini 2.5 Flash achieved the top overall performance, whereas Sabiá 3.1, a model specialized in Brazilian Portuguese, demonstrated competitive results across all three tasks, highlighting the value of locally trained models in capturing linguistic and cultural subtleties.

Details

Paper ID

lrec2026-ws-sigul-30

Pages

pp. 299-311

DOI

10.63317/22b3hhj77udn

BibKey

gama-etal-2026-hunebr

Editors

Atul Kr. Ojha, Sakriani Sakti, Claudia Soria, Maite Melero, John P. McCrae, Constantine Lignos, Chao-Hong Liu, German Rigau Claramunt, Georg Rehm

Publisher

European Language Resources Association (ELRA)

ISSN

N/A

ISBN

N/A

Workshop

Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI, and DCLRL "Towards Inclusivity and Equality: Language Resources and Technologies for Under-Resourced and Endangered Languages"

Location

Palma, Mallorca, Spain

Date

11 - 16 May 2026

Authors

José Matheus do Nascimento Gama
David Candeia Maia
Leandro Balby Marinho
Fabio Morais
João Brunet
