
Generation of Instruction and Preference Dataset for Improving Japanese Instruction Following in LLMs

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3w8ceszaj7m9

Abstract

Instruction following, the ability to generate text that aligns with human intent, is a core capability of large language models (LLMs) for real-world applications. Instruction tuning is widely used to obtain this capability, but it requires large amounts of annotated data. To reduce the labor and cost of large-scale annotation, data augmentation using LLMs has been proposed as a promising approach. As this approach has primarily been applied to English datasets, its effectiveness in other languages, such as Japanese, remains unclear. In this paper, we propose an automatic pipeline for generating instruction and preference datasets in Japanese. The instruction dataset is created by expanding a manually annotated dataset using an LLM. The preference dataset is then constructed by adding LLM-generated negative examples to the instruction dataset. To ensure the quality of the datasets, instructions and responses are evaluated using LLM-as-a-Judge and ROUGE-L. Experimental results using supervised fine-tuning and direct preference optimization demonstrate that these synthetic datasets improve the instruction-following capability in Japanese.
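The abstract mentions filtering generated data with ROUGE-L, which scores lexical overlap via the longest common subsequence. As a rough illustration (not the authors' code), the sketch below shows a standard ROUGE-L F1 computation and a hypothetical near-duplicate filter over generated instructions; the whitespace tokenization and the 0.7 threshold are assumptions for the example, and real Japanese text would need a morphological tokenizer such as MeCab.

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def rouge_l(candidate, reference):
    # ROUGE-L F1 over token sequences (naive whitespace tokenization).
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    p, rec = lcs / len(c), lcs / len(r)
    return 2 * p * rec / (p + rec)

def filter_instructions(generated, existing, threshold=0.7):
    # Keep a generated instruction only if it is not too similar
    # (ROUGE-L >= threshold, a hypothetical cutoff) to any existing one.
    return [g for g in generated
            if all(rouge_l(g, e) < threshold for e in existing)]
```

In a Self-Instruct-style loop, such a filter would be applied each time the LLM proposes new instructions, so the pool grows without accumulating near-duplicates; the paper's actual thresholds and judging prompts are not reproduced here.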

Details

Paper ID
lrec2026-main-111
Pages
pp. 1435-1454
BibKey
moriyama-etal-2026-generation
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11–16 May 2026

Authors

  • Kei Moriyama

  • Takashi Kodama

  • Kouta Nakayama
