Request Correction

Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.

Correction Guidelines

  1. Click the edit button next to a field to report a correction.
  2. Fill in the suggested correction value for each field you want to correct.
  3. Provide your name and email so we can contact you if needed.

Paper Information

lrec2026-main-611

A Scalable Pipeline for Novelty Detection in Skill Extraction Using Large Language Models

Paper Fields

Title

A Scalable Pipeline for Novelty Detection in Skill Extraction Using Large Language Models

Abstract

The rapid evolution of the labor market requires skill ontologies to be continuously updated, but manually identifying emerging skills in job advertisements is highly labor-intensive. This paper presents a scalable, multi-stage pipeline for automated novelty detection in skill extraction. The system combines Large Language Models (LLMs) for candidate generation, a re-matching and threshold-based filtering module ("Turbo") that compares candidates against the existing ontology, and a two-step aggregation process that merges string-based and embedding-based clustering. Experiments on Swiss job advertisement datasets using GPT-4o, Gemini-2.0-flash, and DeepSeek-V3 show that the pipeline effectively reduces noise and manual curation effort: Turbo filtering lowered false positives by 82%, and aggregation reduced the number of items requiring review by 97%. Among the tested models, Gemini-2.0-flash achieved the highest precision, reaching a novelty detection ratio of up to 73% in the qualitative evaluation. These findings demonstrate the pipeline’s potential as an efficient tool for maintaining dynamic skill ontologies.


Authors

Expand an author to correct their information. Use the remove button to request author removal, or add a new author.


PDF Attachment

You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.

Your Information

Author Declaration *
