
Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/443h4s9sm3gy

Abstract

Retrieval-augmented generation (RAG) is a technique in which a large language model (LLM) generates answers based on relevant documents retrieved from an external document collection. Existing RAG evaluation benchmarks often use public data, such as Wikipedia and news articles, as the external document collection. However, such data are highly likely to already be included in the LLM’s pre-training corpus, which may prevent an accurate evaluation of the model’s ability to generate answers based on the retrieved documents. In this study, we construct a Japanese RAG benchmark by having an LLM synthesize documents about non-existent entities and events, and we use this collection of synthetic documents as the retrieval target. Since these synthetic documents are not included in the LLM’s training data, the ability to generate answers based on retrieved documents can be evaluated more accurately. In addition to the synthetic documents, the benchmark consists of questions and gold answers, created through a combination of LLMs and human effort. We then evaluate and analyze the RAG performance of existing LLMs on the constructed benchmark.
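To make the evaluation setup concrete, the following is a minimal illustrative sketch, not the paper's actual pipeline: it pairs a toy synthetic-document collection about fictitious entities (invented here for illustration) with a simple lexical-overlap retriever standing in for a real retriever such as BM25 or a dense model, and checks that the gold answer is supported by the retrieved document rather than by pre-training knowledge.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a real retriever)."""
    return sorted(docs, key=lambda d: len(tokenize(query) & tokenize(d)), reverse=True)[:k]

# Hypothetical synthetic documents about non-existent entities; because they
# never appeared in any pre-training corpus, a correct answer can only come
# from the retrieved context.
docs = [
    "The Aurivell Bridge, completed in 2041, spans the fictional Miremont River.",
    "The Kelsomir Festival is held every spring in the invented town of Bravenne.",
]

question = "When was the Aurivell Bridge completed?"
context = retrieve(question, docs)[0]

# In the real benchmark, an LLM would generate an answer conditioned on the
# retrieved context; here we only verify that the gold answer is recoverable
# from the retrieved synthetic document.
gold = "2041"
print(gold in context)  # True
```

In the actual benchmark, the retrieval step and answer generation are performed by real retrievers and LLMs, and answers are scored against the human-verified gold answers; the sketch only shows why synthetic documents isolate retrieval-grounded answering from memorization.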

Details

Paper ID
lrec2026-main-589
Pages
pp. 7435-7445
BibKey
li-etal-2026-construction
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Shengzhe Li
  • Masaya Ohagi
  • Hayato Tsukagoshi
  • Akihiko Fukuchi
  • Tomohide Shibata
  • Daisuke Kawahara
