Corpus Based Analysis for Multilingual Terminology Entry Compounding

Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010)

DOI:10.63317/2k6s2ou5tqig

Abstract

This paper proposes statistical analysis methods for improving terminology entry compounding. Terminology entry compounding is a mechanism that identifies matching entries across multiple multilingual terminology collections. Bilingual or trilingual term entries are unified into a compounded multilingual entry. We suggest that corpus analysis can improve entry compounding results by analysing the contextual terms of a given term in the corpus data. The proposed algorithm is described and implemented in an experimental setup. Results of an experiment on compounding Latvian and Lithuanian terminology resources are provided. These results encourage further research on different language pairs and in different domains.

Details

Paper ID
lrec2010-main-591
Pages
N/A
BibKey
vasiljevs-balodis-2010-corpus
Editor
N/A
Publisher
European Language Resources Association (ELRA)
ISSN
2522-2686
ISBN
2-9517408-6-7
Conference
Seventh International Conference on Language Resources and Evaluation
Location
Valletta, Malta
Date
17 May 2010 - 23 May 2010

Authors

  • Andrejs Vasiljevs

  • Kaspars Balodis