SALOMO: An Annotation Tool for Complex Annotation Tasks with a Large Number of Labels
Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)
Abstract
Manual annotation of linguistic units such as sentences with labels drawn from a large inventory or taxonomy imposes an enormous cognitive load on human subjects. As an example task, we devised a taxonomy of media bias with 37 categories. Selecting the appropriate category (or none) for thousands of news sentences is likely to be tiring and error-prone for humans. To address this type of annotation task involving large numbers of labels, we present SALOMO, an annotation tool that pre-selects labels by letting a committee of LLMs make decisions. Human annotators are then tasked mainly with resolving cases where the LLMs disagree. While our tool is independent of any particular task, we describe its design, present a short corpus annotated with a novel fine-grained taxonomy of news bias types as a concrete case study, and demonstrate experimentally both the significant time savings and workload reduction achieved with the pre-selection mechanism and the strong bias it introduces toward the displayed selection. We also provide the mini-dataset of biased sentences and their associated bias types from our experiment.