
Link Prediction for Event Logs in the Process Industry

The Fourth Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL 2026)

DOI:10.63317/5jkczh48a2o9

Abstract

In the era of graph-based retrieval-augmented generation (RAG), link prediction is an important preprocessing step for improving the quality of fragmented or incomplete domain-specific data used for graph retrieval. Knowledge management in the process industry uses RAG-based applications to optimize operations, ensure safety, and facilitate continuous improvement by leveraging operational data and past insights. A key challenge in this domain is the fragmented nature of event logs in shift books, where related records are often kept separate even though they belong to a single event or process. This fragmentation hinders the recommendation of previously implemented solutions to users, which is crucial for timely problem-solving at live production sites. To address this problem, we develop a record linking model and frame record linking as a cross-document coreference resolution (CDCR) task. Our approach adapts the CDCR task definition and combines two state-of-the-art CDCR models with the principles of natural language inference (NLI) and semantic textual similarity (STS) to perform link prediction. The evaluation shows that our record linking model outperformed the best versions of our baselines, i.e., NLI and STS, by 28 (11.43 p) and 27.4 (11.21 p), respectively. Our work demonstrates that common NLP tasks can be combined and adapted to the domain-specific setting of the German process industry, improving data quality and connectivity in shift logs.

Details

Paper ID
lrec2026-ws-resourceful-11
Pages
pp. 107-118
BibKey
zhukova-etal-2026-link
Editors
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
The Fourth Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL 2026)
Location
Palma, Mallorca, Spain
Date
11 - 16 May 2026

Authors

  • Anastasia Zhukova

  • Thomas Walton

  • Christian E. Lobmüller

  • Bela Gipp
