
Using Parallel Corpora to enrich Multilingual Lexical Resources

Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002)

DOI:10.63317/22my2kcxk9c4

Abstract

This paper describes the use of a bilingual vector model for the automatic discovery of German translations of English terms. The model is built by analysing co-occurrence patterns in a parallel corpus of English and German medical abstracts, a method also used for Cross-Lingual Information Retrieval. The model generates candidate German translations of English words using the cosine similarity measure between terms in the bilingual vector space. The correct translations could be added to UMLS, the multilingual dictionary in question. The accuracy of the translations is evaluated by measuring how many of the existing UMLS translations are correctly predicted by the vector translations. The model also detects synonymy, particularly acronyms. An online public demonstration of the model is available.
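The ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names (build_vectors, cosine, candidate_translations, precision_against_dictionary) and the use of raw per-document counts as vector coordinates are all assumptions made for illustration, and the paper's model may well apply weighting or dimensionality reduction that is omitted here.

```python
import math
from collections import defaultdict

# Hypothetical toy parallel corpus: (English abstract, German abstract) pairs.
PARALLEL_ABSTRACTS = [
    ("heart attack risk", "herzinfarkt risiko"),
    ("heart surgery", "herz operation"),
    ("bone fracture surgery", "knochen fraktur operation"),
    ("bone fracture risk", "knochen fraktur risiko"),
]


def build_vectors(pairs):
    """Map each (language, term) to a vector of counts, one coordinate per document pair."""
    vectors = defaultdict(lambda: [0.0] * len(pairs))
    for doc_id, (english, german) in enumerate(pairs):
        for word in english.split():
            vectors[("en", word)][doc_id] += 1.0
        for word in german.split():
            vectors[("de", word)][doc_id] += 1.0
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_translations(term, vectors, top_n=3):
    """Rank German terms by cosine similarity to the given English term."""
    source = vectors[("en", term)]
    scored = [
        (word, cosine(source, vec))
        for (lang, word), vec in vectors.items()
        if lang == "de"
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


def precision_against_dictionary(vectors, known):
    """Fraction of known English-to-German pairs whose top-ranked candidate matches."""
    hits = sum(
        1
        for english, german in known.items()
        if candidate_translations(english, vectors, top_n=1)[0][0] == german
    )
    return hits / len(known)


vectors = build_vectors(PARALLEL_ABSTRACTS)
print(candidate_translations("risk", vectors))
# The dictionary below stands in for existing UMLS translations (hypothetical entries).
print(precision_against_dictionary(vectors, {"risk": "risiko", "surgery": "operation"}))
```

In this toy setting, terms that occur in the same aligned abstracts end up with similar vectors, so the top-ranked German term for an English word is a translation candidate. Ties are possible (here "heart" is equally close to "herzinfarkt" and "herz"), which is one reason the candidates would need checking before being added to a dictionary.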

Details

Paper ID
lrec2002-main-103
Pages
N/A
BibKey
widdows-etal-2002-using
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
N/A
Conference
Third International Conference on Language Resources and Evaluation
Location
Las Palmas, Spain
Date
29 May 2002 to 31 May 2002

Authors

  • Dominic Widdows

  • Beate Dorow

  • Chiu-Ki Chan
