Back to Home

Request Correction

Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.

Correction Guidelines

  1. Click the edit button next to a field to report a correction.
  2. Fill in the suggested correction value for each field you want to correct.
  3. Provide your name and email so we can contact you if needed.

Paper Information

lrec2026-ws-kgllm-15

Towards Knowledge Graph-Grounded Evaluation of Agentic LLMs on Cybersecurity Capture-the-Flag Challenges

Paper Fields

Click the edit button next to a field to report a correction.

Title

Towards Knowledge Graph-Grounded Evaluation of Agentic LLMs on Cybersecurity Capture-the-Flag Challenges

Abstract

Evaluating Large Language Model (LLM) agents on complex multi-step cybersecurity tasks requires structured, reproducible evaluation rubrics. We present BraceGreen, a framework that formalizes Capture-the-Flag (CTF) attack paths as knowledge graphs and uses them as gold-standard rubrics for agentic LLM evaluation. Each node in our knowledge graphs represents an attack step annotated with MITRE ATT&CK tactics, goals, commands, expected outputs, and semantic outcomes, while edges encode prerequisites, dependencies, and alternative paths. Our LangGraph-based evaluation workflow employs LLM-as-judge with chain-of-thought reasoning to semantically compare agent predictions against knowledge graph-encoded alternatives. We contribute a benchmark of 7 CTF machines with knowledge graph annotations, three evaluation modes (command prediction, goal inference, anticipated result), and integration with live machine infrastructure via virtual machines and a MCP server. Our approach bridges the gap between unstructured CTF writeups and graph-structured evaluation rubrics.


Authors

Expand an author to correct their information. Use the remove button to request author removal, or add a new author.


PDF Attachment

You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.

Drag & drop a PDF here, or click to select

Your Information

Author Declaration *

Select at least one field to correct using the edit buttons above.