Back to Main Conference 2010
LREC 2010main

Semi-Automated Extension of a Specialized Medical Lexicon for French

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/4kvymsz8ok9j

Abstract

This paper describes the development of a specialized lexical resource for a specialized domain, namely medicine. First, in order to assess the linguistic phenomena that need to be adressed, we based our observation on a large collection of more than 300'000 terms, organised around conceptual identifiers. Based on these observations, we highlight the specificities that such a lexicon should take into account, namely in terms of inflectional and derivational knowledge. In a first experiment, we show that general resources lack a large part of the words needed to process specialized language. Secondly, we describe an experiment to feed semi-automatically a medical lexicon and populate it with inflectional information. This experiment is based on a semi-automatic methods that tries to acquire inflectional knowledge from frequent endings of words recorded in existing lexicon. Thanks to this, we increased the coverage of the target vocabulary from 14.1% to 25.7%.

Details

Paper ID
lrec2010-main-288
Pages
N/A
BibKey
cartoni-zweigenbaum-2010-semi
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 23 May 2010

Authors

  • BC

    Bruno Cartoni

  • PZ

    Pierre Zweigenbaum

Links