Decode the Law: Towards Legal Text Simplification with Large Language Models
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Legal documents are often verbose and structurally complex, posing significant barriers to public understanding and equitable access to justice. Despite growing interest in text simplification, efforts targeting the legal domain remain limited by a lack of robust, high-quality resources. In this paper, we address this gap by introducing SIMPLE-LAW, a curated benchmark dataset of over 6,000 aligned pairs of original and simplified legal passages, constructed to facilitate research on legal text simplification with large language models (LLMs). Using this dataset, we evaluate a range of state-of-the-art LLMs under both in-context learning and parameter-efficient fine-tuning paradigms, including Unsloth variants of Mistral, LLaMA-3.2, Gemma, and Qwen-2.5. We assess performance with BERTScore, ROUGE, SARI, and a hallucination detection score, capturing both fidelity and readability. Results show that fine-tuned models significantly outperform in-context learners in both simplification quality and factual consistency. By offering a new dataset, a rigorous evaluation protocol, and baseline comparisons, our work provides a critical foundation for developing transparent and accessible AI systems in the legal domain.
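As a minimal sketch of the kind of evaluation the abstract describes (not the authors' exact pipeline), the snippet below scores a single hypothetical source/simplification pair with SARI, ROUGE, and BERTScore via the Hugging Face `evaluate` library; all texts are illustrative placeholders, and the paper's hallucination detection score is not reproduced here.

```python
# A minimal sketch, assuming the Hugging Face `evaluate` library;
# this is not the paper's evaluation pipeline, and all texts below
# are illustrative placeholders.
import evaluate

sources = ["The lessee shall indemnify the lessor against all claims arising from the premises."]
predictions = ["The renter must cover the owner's costs for any claims about the property."]
# SARI expects one or more reference simplifications per source.
references = [["The renter agrees to pay the owner back for any claims about the property."]]

sari = evaluate.load("sari")
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

# SARI compares the prediction against both the source and the references.
print(sari.compute(sources=sources, predictions=predictions, references=references))
# ROUGE and BERTScore compare the prediction against a reference text.
print(rouge.compute(predictions=predictions, references=[refs[0] for refs in references]))
print(bertscore.compute(predictions=predictions, references=[refs[0] for refs in references], lang="en"))
```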