
TermEval 2020: TALN-LS2N System for Automatic Term Extraction

Proceedings of the 6th International Workshop on Computational Terminology

DOI:10.63317/3dxytca7ntt7

Abstract

Automatic terminology extraction is a notoriously difficult task that aims to reduce the effort required to manually identify terms in domain-specific corpora by automatically providing a ranked list of candidate terms. Approaches to this task fall into four main categories: (i) rule-based approaches, (ii) feature-based approaches, (iii) context-based approaches, and (iv) hybrid approaches. For this first TermEval shared task, we explore a feature-based approach and a deep neural network multitask approach, BERT, that we fine-tune for term extraction. We show that BERT models (RoBERTa for English and CamemBERT for French) outperform other systems for both French and English.

Details

Paper ID
lrec2020-ws-computerm-13
Pages
pp. 95-100
BibKey
hazem-etal-2020-termeval
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
N/A
ISBN
N/A
Workshop
Proceedings of the 6th International Workshop on Computational Terminology
Date
11 May 2020 to 16 May 2020

Authors

  • Amir Hazem
  • Mérieme Bouhandi
  • Florian Boudin
  • Beatrice Daille

Links