LREC 2026 main conference

Linking Rationale to Decision on Internet Standards: A Retrieval-Based Approach Using Synthetic Data

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3szh4omfcsxb

Abstract

The Internet Engineering Task Force (IETF) develops Internet-Drafts (I-Ds) and Requests for Comments (RFCs) as formal specifications for Internet Protocols. While these documents capture finalized technical standards, the rich design rationales and deliberations that shape them are often buried in informal discussions across mailing lists. These discussions are rarely linked explicitly to the specifications they inform, making it difficult to trace the origins of specific design decisions. We address this gap by generating synthetic data that explicitly links discussion threads to their corresponding RFC/I‑D sections, producing roughly 350 000 such aligned instances. This data enables training a semantic embedding-based information retrieval (IR) system that, given an email discussion, retrieves the most relevant specification content. Our experiments show that this synthetic supervision helps models learn associations between informal discourse and formal documentation, though the task remains challenging due to the implicit and context-dependent nature of the links.

Details

Paper ID
lrec2026-main-568
Pages
pp. 7149-7162
BibKey
bian-etal-2026-linking
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11-16 May 2026

Authors

  • Jie Bian
  • Michael Welzl
