A Discourse-based Tool Series for Logical Validation of LLMs
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Large Language Models (LLMs) frequently produce fluent but unverifiable reasoning, resulting in potential hallucinations and faulty inferences. This study proposes ValidLogic4LLM, a logic-programming-based verification framework in which the reasoning expressed by an LLM is transformed into a logic program (LP) in plain, probabilistic, defeasible, or abductive form, representing world knowledge and a given problem description, such as a patient health complaint. The LP constructed by the LLM is executed within a symbolic reasoning engine, and the resulting inferences are compared to the LLM's natural-language conclusions. The strength or probability of facts, clauses, and arguments is computed from the discourse structure of the text expressing them. Divergence between symbolic and neural reasoning outcomes indicates possible hallucination or inconsistency in the model's internal logic.
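To make the comparison step concrete, the following minimal Python sketch illustrates the idea under strong simplifying assumptions: the LP is restricted to ground Horn clauses evaluated by naive forward chaining, and the predicates (fever, cough, flu) and the mapping of the LLM's conclusion to a single atom are hypothetical illustrations, not taken from the framework itself.

    # Minimal sketch of the symbolic-vs-neural comparison step.
    # Assumptions: ground Horn clauses only; predicate names are
    # illustrative placeholders, not part of ValidLogic4LLM.

    from typing import Set, Tuple

    # A rule pairs a body (tuple of atoms) with a head atom.
    Rule = Tuple[Tuple[str, ...], str]

    def forward_chain(facts: Set[str], rules: Tuple[Rule, ...]) -> Set[str]:
        """Saturate the fact set under the rules by naive forward chaining."""
        derived = set(facts)
        changed = True
        while changed:
            changed = False
            for body, head in rules:
                if head not in derived and all(a in derived for a in body):
                    derived.add(head)
                    changed = True
        return derived

    # World knowledge and problem description extracted from the LLM's reasoning.
    rules: Tuple[Rule, ...] = (
        (("fever(patient)", "cough(patient)"), "flu(patient)"),
    )
    facts = {"fever(patient)", "cough(patient)"}

    # The LLM's natural-language conclusion, mapped to a logical atom.
    llm_conclusion = "flu(patient)"

    symbolic_conclusions = forward_chain(facts, rules)
    if llm_conclusion in symbolic_conclusions:
        print("Consistent: the symbolic engine derives the LLM's conclusion.")
    else:
        print("Divergence: possible hallucination or faulty inference.")

In the full framework the reasoning engine would also execute probabilistic, defeasible, and abductive programs; this sketch covers only the plain-LP case to show where divergence between the two reasoning outcomes would be flagged.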