Large Language Models for Knowledge Graph Extraction: A Schema-Constrained Evaluation Framework

Proceedings of the Knowledge Graphs and Large Language Models Workshop (KG-LLM) @ LREC26

Abstract

Large language models enable zero-shot knowledge graph extraction from text, yet evaluation at the level of complete typed graphs remains an open challenge. We present a schema-constrained evaluation framework that combines an explicit ontology of six entity types and 96 relation types with structured generation guided by schema-injected prompts. Supporting both single-step and two-step extraction modes, controlled inference settings, and repeated-run stability analysis, the framework enables systematic benchmarking of LLM-based graph construction under closed ontology constraints. Four large language models (Gemini 3 Pro, GPT-5.1, Claude Opus 4.5, and Mistral 7B) are evaluated on DocRED using entity and triple F1, schema adherence, and run consistency. Manual review reveals that automatic triple F1 systematically underestimates extraction quality, because a substantial portion of model-predicted triples are textually valid but absent from the incomplete gold annotations. The framework, prompts, and experimental outputs are publicly available.
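
To illustrate why incomplete gold annotations can depress automatic triple F1, the following is a minimal sketch and not the framework's actual scoring code. It assumes exact-match comparison of (head, relation, tail) triples and measures schema adherence as the fraction of predicted triples whose relation belongs to the ontology. The names `Triple`, `triple_prf`, and `schema_adherence`, as well as the relation labels and the toy triples, are hypothetical.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """A typed edge of the extracted graph (hypothetical representation)."""
    head: str
    relation: str
    tail: str


def triple_prf(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over sets of triples (assumed matching rule)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def schema_adherence(predicted: Iterable[Triple], allowed_relations: Set[str]) -> float:
    """Fraction of predicted triples whose relation is in the closed ontology."""
    predicted = list(predicted)
    if not predicted:
        return 1.0  # assumption: an empty prediction violates nothing
    valid = sum(1 for t in predicted if t.relation in allowed_relations)
    return valid / len(predicted)


# Toy example: the second predicted triple is supported by the text
# but missing from the gold annotations, so it is scored as a false positive.
allowed = {"capital_of", "located_in"}  # hypothetical relation labels
gold = [Triple("Paris", "capital_of", "France")]
predicted = [
    Triple("Paris", "capital_of", "France"),
    Triple("Paris", "located_in", "France"),
]

precision, recall, f1 = triple_prf(predicted, gold)
adherence = schema_adherence(predicted, allowed)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f} adherence={adherence:.2f}")
```

In this toy case both predicted triples conform to the schema, yet precision drops to 0.5 because one correct triple is unannotated. This is the kind of gap that a manual review can surface and that automatic scoring alone cannot.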