The Fourth Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL 2026)
LREC 2026 Workshop
Lost in Translation: Repurposing semantic similarity benchmarks for evaluating lexical-semantic consistency in LLM-based machine translation
Quin Ye, Jelke Bloem
Bridging the Low Resource Gap in Historical Cryptology: A Multilingual Diachronic Synthetic Dataset for Reproducible Cryptanalysis
Micaella Bruton, Meriem Beloucif, Beáta Megyesi
Cultural Grounding in Swedish: Extending an Everyday Knowledge Benchmark for LLMs
Meriem Beloucif, Johan Sjons
Entity Linking for Faroese Using Large Language Models with Web Search
Annika Simonsen, Iben Nyholm Debess, Hafsteinn Einarsson
From Polyester Girlfriends to Blind Mice: Creating the First Pragmatics Understanding Benchmarks for Slovene
Mojca Brglez, Spela Vintar
SdQuAD: A Large Benchmark Question Answering Dataset for Low-resource Sindhi Language
Wazir Ali, Muhammad Rafay Shaikh, Nadia Ali, Amar Rehman
LLMs as Assistants for Data Annotation: Addressing Disagreement and Supporting Expert Processes
Mark Andrade, Bláithín Heffernan, Abigail Walsh, Sheila Castilho
Annotation Quality in Aspect-Based Sentiment Analysis: A Case Study Comparing Experts, Students, Crowdworkers, and Large Language Models
Niklas Donhauser, Jakob Fehle, Nils Constantin Hellwig, Markus Weinberger, Udo Kruschwitz, Christian Wolff
Cross-Lingual Mathematical Reasoning in LLMs: Evaluating Performance on Icelandic vs. English Problems
Hafsteinn Einarsson
Struct2Unstruct: Creating Tender NER Datasets from Structured Procurement Records using Large Language Models
Asim Abbas, Mark Lee, Niloofer Shanavas, Venelin Kovatchev, Mubashir Ali
Link Prediction for Event Logs in the Process Industry
Anastasia Zhukova, Thomas Walton, Christian E. Lobmüller, Bela Gipp
MultiZebraLogic: A Multilingual Logical Reasoning Benchmark
Sofie Bruun, Dan Saattrup Smart
Progressing beyond Art Masterpieces or Touristic Clichés: how to assess your LLMs for cultural alignment?
António Branco, João Ricardo Silva, Nuno Marques, Luis M. S. Gomes, Ricardo Campos, Raquel Sequeira, Sara Nerea, Rodrigo Silva, Miguel Marques, Rodrigo Duarte, Artur Putyato, Diogo Folques, Tiago Valente
Evaluating Large Language Model-based Natural Language Generation for Modular Dialog systems
Vincent Emmerling, Christoph Kowalski, Amelie Sophie Robrecht-Hilbig, Stefan Kopp
JobResQA: Semi-Automatic Multilingual Benchmark Creation for LLM Machine Reading Comprehension on Résumés and Job Descriptions
Casimiro Pio Carrino, Paula Estrella, Rabih Zbib, Carlos Escolano, Jose A. R. Fonollosa
Beyond English and Evasion: A Human-Annotated Multi-Domain Benchmark for High-Stakes LLM Safety Evaluation in Chinese
Wajdi Zaghouani, Kholoud Khalil Aldous, Yicheng Gao
A multilingual hallucination benchmark
Freja Thoresen, Dan Saattrup Smart
Exploring the similarities and differences between VLM-driven and traditional OCR for Historical Swedish Data
Martin Johansson, Selma Waginder, Dana Dannélls