Request Correction

Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.

Correction Guidelines

  1. Click the edit button next to a field to report a correction.
  2. Fill in the suggested correction value for each field you want to correct.
  3. Provide your name and email so we can contact you if needed.

Paper Information

lrec2026-ws-dialres-26

HeptaTAX: A Neuro-Symbolic Pipeline and Benchmark for Classifying 16th-Century Heptanesian Notarial Acts

Paper Fields

Title

HeptaTAX: A Neuro-Symbolic Pipeline and Benchmark for Classifying 16th-Century Heptanesian Notarial Acts

Abstract

This study originates in the investigation of lexical bundles and formulaic language in 16th-century Corfiot notarial documents. The functional variation observed across identical formulaic sequences motivated the development of a document classification framework designed to support the structural interpretation of such language. Because 16th-century Corfiot notarial acts are a rich but understudied dialectal resource, their systematic categorization into subgenres is essential for their full exploration. However, the task requires substantial manual work, and no NLP tools exist for this task or dialect. In this paper, we take an initial step in this direction. First, we present a corpus of 1,088 notarial acts from 5 notaries spanning 1500-1567, a 3-tier annotation schema (17 core genres, extension subcategories, hybrid cross-cutting tags), and a 40-act benchmark with gold annotations at all three tiers. Then, we evaluate 12 LLMs across 4 architectures: zero-shot, few-shot, full-context, and Neuro-Symbolic (NeSy). For the latter, we introduce a symbolic engine comprising a set of deterministic rules for identifying discriminative legal formulae, whose output is then injected into the neural (LLM) engine. The results show that the NeSy architecture compresses the accuracy gap between stronger and weaker models from 47.5 pp to 12.5 pp, with the smallest model (Llama 3.1 8B) gaining 47.5 pp and matching frontier models that operate without symbolic support. Three models reach a ceiling of 72.5% on the core tier. However, consistent errors in procedurally dense material reveal the limits of lexical and formulaic cues for identifying legal effect, motivating the use of symbolic signals in the NeSy pipeline. Extension and hybrid classification remain open challenges, with best scores of approximately 63% and 35%, respectively.


Authors

Expand an author to correct their information. Use the remove button to request author removal, or add a new author.


PDF Attachment

You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.

Your Information

Author Declaration *

Select at least one field to correct using the edit buttons above.