Back to Main Conference 2024
LREC-COLING 2024main

Mapping Work Task Descriptions from German Job Ads on the O*NET Work Activities Ontology

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

DOI:10.63317/3f52oduzbt3n

Abstract

This work addresses the challenge of extracting job tasks from German job postings and mapping them to the fine-grained work activities classification in the O*NET labor market ontology. By utilizing ontological data with a Multiple Negatives Ranking loss and integrating a modest volume of labeled job advertisement data into the training process, our top configuration achieved a notable precision of 70% for the best mapping on the test set, representing a substantial improvement compared to the 33% baseline delivered by a general-domain SBERT. In our experiments the following factors proved to be most effective for improving SBERT models: First, the incorporation of subspan markup, both during training and inference, supports accurate classification, by streamlining varied job ad task formats with structured, uniform ontological work activities. Second, the inclusion of additional occupational information from O*NET into training supported learning by contextualizing hierarchical ontological relationships. Third, the most significant performance improvement was achieved by updating SBERT models with labeled job ad data specifically addressing challenging cases encountered during pre-finetuning, effectively bridging the semantic gap between O*NET and job ad data.

Details

Paper ID
lrec2024-main-0963
Pages
pp. 11049-11059
BibKey
gnehm-clematide-2024-mapping
Editor
N/A
Publisher
European Language Resources Association (ELRA) and ICCL
ISSN
2522-2686
ISBN
979-10-95546-34-4
Conference
Joint International Conference on Computational Linguistics, Language Resources and Evaluation
Location
Turin, Italy
Date
20 May 2024 25 May 2024

Authors

  • AG

    Ann-Sophie Gnehm

  • SC

    Simon Clematide

Links