JAPAS: A Benchmark and Neural Approach for Japanese Patent Support Relation Extraction
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Efficient analysis of patent literature is crucial for technological development and for protecting intellectual property. A key task is verifying the “support requirement,” which mandates that the detailed description fully describe the claimed invention. This requirement is fundamental to a patent’s validity. Manual verification is labor-intensive and demands both technical and legal expertise, making automation highly desirable. However, research on this task has been hampered by two key challenges: (1) the absence of a public benchmark, and (2) the reliance of prior work on lexical matching, which fails to capture semantic equivalence. To address these issues, we introduce JAPAS, the first public benchmark for this task, comprising over 2,000 manually annotated instances from Japanese patents. Each instance is labeled with a claim span, a supporting description paragraph, a relation type, and the annotator’s confidence level. Using this benchmark, we also establish modern baselines that capture semantic similarity, including embedding-based methods and large language models (LLMs). Our experiments show that a fine-tuned Qwen3-14B model achieves an F1 score of 0.50, outperforming the conventional lexical baseline. This result shows that the task is feasible yet challenging, highlighting the utility of JAPAS as a research foundation and providing a performance target for future work.
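To make the instance structure and the contrast with lexical matching concrete, the following is a minimal sketch under stated assumptions. The class `SupportInstance`, its field names, the character-bigram Jaccard score, and the binary F1 helper are all hypothetical illustrations. They are not the JAPAS schema, the lexical baseline evaluated in this paper, or its exact evaluation protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupportInstance:
    """Hypothetical record layout; field names are illustrative only."""
    claim_span: str             # text span taken from a claim
    description_paragraph: str  # candidate supporting paragraph
    relation_type: str          # annotated relation label
    confidence: str             # annotator's confidence level


def char_bigrams(text: str) -> set[str]:
    # Character bigrams avoid the need for word segmentation in Japanese.
    return {text[i:i + 2] for i in range(len(text) - 1)}


def lexical_overlap(claim: str, paragraph: str) -> float:
    # Jaccard similarity over character bigrams (an assumed stand-in for
    # a lexical-matching baseline). Paraphrases that share few characters
    # score low even when they mean the same thing.
    a, b = char_bigrams(claim), char_bigrams(paragraph)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def f1_score(gold: list[bool], pred: list[bool]) -> float:
    # Binary F1: harmonic mean of precision and recall.
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A surface-overlap score of this kind depends only on shared character sequences, which is why a semantically equivalent but differently worded description paragraph can go undetected.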