APODICTUS: Automatic Processing of DICTionary Update candidateS

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)

DOI:10.63317/3rtgegtzmpmv

Abstract

Dictionaries have to be regularly updated. Some dictionary-makers gather proposals for updates of sense entries in internal databases. We automate the process of verifying and prioritizing such sense proposals, and facilitate their addition to a dictionary, by building a sophisticated processing pipeline relying on state-of-the-art language models. Our pipeline presents the first systematic, large-scale, and comprehensive solution for processing candidates for inclusion in a dictionary, and we test it in an industry-relevant context. We conduct several experiments to evaluate the pipeline and provide an annotated dataset for future work. Model performance is acceptable for words that are not yet in the dictionary, but low for in-dictionary words. Through an error analysis and a model component ablation, we gain further insight into directions for future model improvements.

Details

Paper ID
lrec2026-main-924
Pages
pp. 11800-11812
BibKey
blessing-etal-2026-apodictus
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
978-2-493814-49-4
Conference
The Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Location
Palma, Mallorca, Spain
Date
11 May 2026 to 16 May 2026

Authors

  • Felix Blessing

  • Johannes S. Sax

  • Julian Kaufmann

  • Wei Zhao

  • Nikolay Arefyev

  • Dominik Schlechtweg

Links