Automatic Detection of Difficulty of French Medical Sequences in Context

Proceedings of the 18th Workshop on Multiword Expressions @LREC2022

DOI:10.63317/4en557vz7mms

Abstract

Medical documents use technical terms (single-word or multi-word expressions) with very specific semantics. Patients may find these terms difficult to understand, which may hinder their comprehension of medical information. Before such terms can be simplified, it is important to detect difficult-to-understand syntactic groups in medical documents, as they may correspond to or contain technical terms. We address this question as a categorization task: predicting which syntactic groups are difficult to understand within syntactically analyzed medical documents. We use different models for this task: one built with only internal (linguistic) features, one built with only external (contextual) features, and one built with both sets of features. Our results show an F-measure over 0.8. The use of contextual (external) features and of annotations from all annotators has a positive impact on the results. Ablation tests indicate that frequencies in large corpora and lexicon are relevant for this task.
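For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F-measure can be computed for a binary "difficult" versus "easy" categorization. The label names, the toy gold and predicted sequences, and the `f_measure` helper are hypothetical illustrations, not the authors' data or code, and the abstract does not state which averaging scheme the reported score uses.

```python
def f_measure(gold, predicted, positive="difficult"):
    """Balanced F-measure (F1) for one positive class.

    `gold` and `predicted` are equal-length lists of labels.
    The label name "difficult" is an assumption made for this sketch.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: five syntactic groups with invented labels.
gold = ["difficult", "easy", "difficult", "difficult", "easy"]
predicted = ["difficult", "easy", "easy", "difficult", "difficult"]
score = f_measure(gold, predicted)
print(f"F-measure: {score:.3f}")
```

In this toy case two of the three groups predicted as difficult are correct, and two of the three truly difficult groups are found, so precision and recall are both 2/3.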

Details

Paper ID
lrec2022-ws-mwe-09
Pages
pp. 55-66
BibKey
koptient-grabar-2022-automatic
Publisher
European Language Resources Association (ELRA)
Workshop
Proceedings of the 18th Workshop on Multiword Expressions @LREC2022
Date
20 June 2022 to 25 June 2022

Authors

  • Anaïs Koptient
  • Natalia Grabar
