
RespondeoQA: A Benchmark for Bilingual Latin-English Question Answering

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/58p5htfv3nad

Abstract

We introduce a benchmark dataset for question answering and translation in bilingual Latin and English settings, containing about 7,800 question–answer pairs. The questions are drawn from Latin pedagogical sources, including exams, quizbowl-style trivia, and textbooks ranging from the 1800s to the present. After automated extraction, cleaning, and manual review, the dataset covers a diverse range of question types: knowledge- and skill-based, multihop reasoning, constrained translation, and mixed language pairs. To our knowledge, this is the first QA benchmark centered on Latin. As a case study, we evaluate three large language models (LLaMa 3, Qwen QwQ, and OpenAI's o3-mini), finding that all perform worse on skill-oriented questions. Although the reasoning models perform better on scansion and literary-device tasks, they offer limited improvement overall. QwQ performs slightly better on questions asked in Latin, while LLaMa 3 and o3-mini are more task-dependent. This dataset provides a new resource for assessing model capabilities in a specialized linguistic and cultural domain, and the creation process can be easily adapted to other languages. The dataset is available at: [https://github.com/slanglab/RespondeoQA](https://github.com/slanglab/RespondeoQA)

Details

Paper ID
lrec2026-main-080
Pages
pp. 1031-1043
BibKey
hudspeth-etal-2026-respondeoqa
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Marisa Hudspeth
  • Patrick J. Burns
  • Brendan O'Connor
